Massachusetts  Institute  of  Technology 

Department  of  Economics 

Working  Paper  Series 


PROGRESSIVE  ESTATE  TAXATION 


Emmanuel  Farhi 
Ivan  Werning 


Working  Paper  06-27 
September  25,  2006 


Room  E52-251 

50  Memorial  Drive 

Cambridge, MA 02142


This  paper  can  be  downloaded  without  charge  from  the 

Social  Science  Research  Network  Paper  Collection  at 

http://ssrn.com/abstract=932764 




Progressive  Estate  Taxation 


Emmanuel  Farhi  Ivan  Werning 

MIT  MIT 

efarhi@mit.edu  iwerning@mit.edu


September  2006 

Abstract 

For an economy with altruistic parents facing productivity shocks, the optimal estate
taxation is progressive: fortunate parents should face lower net returns on their inheritances.
This progressivity reflects optimal mean reversion in consumption, which ensures
that a long-run steady state exists with bounded inequality, avoiding immiseration.


Introduction 

Arguably,  the  biggest  risk  in  life  is  the  family  one  is  born  into.  In  particular,  newborns  partly 
inherit the luck, good or bad, of their parents and ancestors, passed on by the wealth accumulated
within their dynasty. This makes them concerned not only with their own uncertain
skills  and  earning  potential,  but  also  with  that  of  their  progenitors.  They  value  insurance, 
from  behind  the  veil  of  ignorance,  against  these  risks.  On  the  other  hand,  altruistic  parents 
are  partly  motivated  to  work  because  of  the  impact  their  effort  can  have,  through  bequests, 
on  their  children's  wellbeing.  The  intergenerational  transmission  of  welfare  determines  the 
balance  between  insuring  newborns  and  parental  incentives. 

One  instrument  societies  use  to  regulate  the  degree  of  this  intergenerational  transmission 
is  estate  taxation.  This  paper  examines  the  optimal  design  of  the  estate  tax  by  characterizing 
Pareto efficient allocations in an economy featuring the tradeoff between incentives of parents
and insurance of newborns. Our main result is that estate taxation should be progressive:
fortunate parents should face a higher marginal tax rate on their bequests.

We begin with a two-period Mirrleesian economy with two generations linked by parental
altruism; we then extend our analysis to an infinite horizon economy similar to Atkeson and
Lucas (1995) and Albanesi and Sleet (2004). In our simplest economy, a continuum of parents
lives during the first period. In the second period each is replaced by a single descendant,
and parents are altruistic towards this child. Parents work, consume and bequeath; children
simply consume.^1 Following Mirrlees (1971), parents first observe a random productivity draw
and then exert work effort. Both productivity and work effort are private information; only
output, the product of the two, is publicly observable. We study the entire set of constrained
Pareto efficient allocations and derive their implications for marginal tax rates.

For this economy, if one assumes that the social welfare criterion coincides with the parent's
expected utility, then Atkinson and Stiglitz's (1976) celebrated uniform-taxation result
applies: the optimal estate tax is zero. That is, when no direct weight is placed on the welfare
of children, income should be taxed nonlinearly (as in Mirrlees, 1971), but bequests should
go untaxed. This arrangement ensures that the intertemporal consumption choice made by
parents, trading off their own consumption against their child's consumption, is undistorted.
As a result, the inheritability of welfare across generations is perfect: a child's consumption
rises one-for-one with that of its parent. In effect, efficiency dictates that altruism be exploited
to provide higher incentives for parents, by manipulating their children's consumption.
Inequality for the children's generation is created as a byproduct, since their expected welfare
is of no direct concern. Indeed, in this economy, if parents were not altruistic, the children's
expected utility would be higher at any efficient allocation.

While this describes one efficient allocation, the picture is incomplete. In this economy
the parent and child are distinct individuals, albeit linked through parental altruism, a form
of externality. Thus, a complete welfare analysis requires examining the ex-ante utility of
both parents and children. Figure 1 depicts our economy's Pareto frontier, which
is peaked because an altruistic parent's welfare decreases if the child is made relatively too
miserable. The allocation discussed in the previous paragraph is a particular point lying on the
Pareto frontier: the peak, which maximizes the welfare of parents and is marked as point A in the
figure. In this paper we explore other efficient arrangements, represented by points lying on
the downward sloping section of the Pareto frontier, to the right of its peak.

Away from point A, a role for estate taxation emerges: efficient allocations which lie to
the right of the peak can be implemented with a simple tax system that confronts parents
with separate nonlinear schedules for income and estate taxes. Our main result is that the
optimal estate tax schedule is convex: fortunate high-skilled parents face a higher marginal
tax rate on their bequests.

^1 Although some readers have remarked that they find this assumption realistic, it will be relaxed when
we extend the time horizon.

Figure 1: Pareto frontier between ex-ante utility for parent, Vp, and child, Vc

The intuition for this result is that progressive estate taxation arises to insure children
against their parent's luck. The progressive estate tax lowers consumption inequality within
the children's generation (which is desirable as long as some weight, however small, is placed
on them in the social welfare criterion) while still providing incentives to parents. A child's
consumption still varies with their parent's, providing some incentives, but now does so less
than one-for-one, providing some insurance. In other words, consumption mean reverts across
generations, making the inheritability of welfare imperfect. The optimal progressivity in
estate taxes reflects this mean reversion: fortunate dynasties must face a lower net return on
bequests so that they choose a consumption path declining towards the mean.

Our stark conclusion on the progressivity of estate taxation strongly contrasts with the
theoretical ambiguity in the shape of the optimal income tax schedule. Mirrlees's (1971)
seminal paper showed that for bounded distributions of skills the optimal marginal income
tax rates are regressive at the top (see also Seade, 1982; Tuomala, 1990; Ebert, 1992). More
recently, Diamond (1998) has shown that the opposite, progressivity at the top, is true with
certain unbounded skill distributions (see also Saez, 2001). In contrast, our results on the
progressivity of the estate tax do not depend on any assumptions regarding the distribution
of skills.

We  then  extend  our  analysis  for  the  two-period  setup  to  an  infinite-horizon  economy  with 
non-overlapping  generations.  This  extension  is  important  for  at  least  two  reasons. 

First, it provides a motivation for studying efficient allocations which do not simply maximize
the expected utility of the very first generation, the analogues of point A from Figure 1
for the infinite-horizon economy. Indeed, these allocations have everyone in distant generations
converging to misery, with zero consumption. This is a version of the immiseration
result shown by Atkeson and Lucas (1992) for a taste shock economy, which we extend here to
the Mirrleesian economy. Loosely speaking, if we continue to plot the expected utility of the
last generation on the horizontal axis, then as we extend the horizon point A moves further
and further to the left, towards misery. This provides a rationale for focusing on efficient
allocations that place positive weight on future generations, the analogues of the downward
sloping section of the Pareto frontier, to the right of point A on the figure. However, as we
show here by extending the analysis in Farhi and Werning (2005), this result is special: by
placing weight on future generations, so that the social discount factor is greater than the
private one, a steady state exists, misery is avoided and there is social mobility.

Second, the infinite horizon version of our model allows us to make contact with a growing
literature on dynamic Mirrleesian models, such as Golosov, Kocherlakota and Tsyvinski
(2003), Albanesi and Sleet (2004) and others (see further references in Golosov, Tsyvinski and
Werning, 2006). In our model each individual lives for a single period, observes a productivity
draw and works, and is then replaced by a single descendant in the next period. As usual,
perfect altruism implies that each dynasty behaves as a single infinitely-lived individual, so
our model environment is identical to Albanesi and Sleet (2004). However, our intergenerational
interpretation of the infinite horizon leads us to study a different planning problem,
one that puts direct weight on the expected utility of future generations, or equivalently, one
that has a social discount factor that is higher than the private one. Indeed, to avoid the
immiseration result mentioned above, Albanesi and Sleet impose an ad hoc lower bound on
continuation utility along the equilibrium path; in contrast, our steady-state analysis requires
no such imposition.

The  progressivity  of  estate  taxes  extends  to  this  infinite  horizon  setup:  fortunate  parents 
face  a  higher  average  marginal  tax  rate  on  their  bequests.  Indeed,  the  average  marginal  estate 
tax  rate  formula  is  the  same  as  in  the  two-period  economy.  The  main  difference  between 
the  two-period  and  infinite  horizon  economies  is  that  tax  implementations  are  more  involved 
in the latter. We adapt Kocherlakota's (2004) implementation, which yields a marginal
estate tax rate that is zero, on average, for all parents when only the first generation is weighted
in the welfare criterion, the analogue of point A.

Throughout this paper, we study an economy without capital, where aggregate consumption
equals aggregate produced output plus an endowment. This no-aggregate-savings assumption
allows us to focus on redistribution within generations and abstract from transfers
across generations. Unfortunately, it does not allow us to pin down the level of estate taxation,
only its shape. Farhi, Kocherlakota and Werning (2005) extend this model along several
dimensions (including capital accumulation, life-cycle elements and general skill processes)
and show that our main result on progressive estate taxation is insensitive to this assumption.

Our  work  relates  to  a  number  of  recent  papers  that  have  explored  the  implications  of 
including  future  generations  in  the  social  welfare  criterion.  Phelan  (2005)  considered  a  social 
planning  problem  that  weighted  all  generations  equally,  which  is  equivalent  to  not  discounting 
the  future  at  all.  Farhi  and  Werning  (2005)  considered  intermediate  cases,  where  future 
generations  receive  a  geometrically  declining  weight.  This  is  equivalent  to  a  social  discount 
factor  that  is  less  than  one  and  higher  than  the  private  one.  Sleet  and  Yeltekin  (2005)  have 
studied  how  such  a  higher  social  discount  factor  may  arise  from  a  utilitarian  planner  without 
commitment.  None  of  these  papers  consider  implications  for  estate  taxation. 

We  organized  the  rest  of  the  paper  in  the  following  way.  Section  1  describes  the  two  period 
model  environment  and  Section  2  introduces  the  associated  planning  problem.  Our  main 
results  for  this  two-period  economy  are  in  Section  3.  In  Section  4  we  describe  the  extension 
to  an  infinite  horizon.  The  main  results  for  that  economy  are  contained  in  Section  5.  We  use 
Section  6  for  concluding  remarks. 

1      Parent  and  Child:  A  Two  Period  Economy 

There are two periods labelled $t = 0, 1$. The parent lives during $t = 0$ and is replaced by a
single child at $t = 1$. The parent works and consumes, while the child only consumes. Thus,
an allocation is a triplet of functions $(c_0(w_0), c_1(w_0), y_0(w_0))$, where $c_0$ and $y_0$ represent the
parent's consumption and output, and $c_1$ represents the child's consumption.
The parent is altruistic towards the child:

$$v_0 = E\left[u(c_0) - h\left(\frac{y_0}{w_0}\right) + \beta v_1\right] \tag{1}$$

where the expectation is over $w_0$ and $\beta < 1$. The child's utility is simply

$$v_1 = u(c_1) \tag{2}$$

The utility function $u(c)$ is increasing, concave and differentiable; the disutility function $h(n)$
is assumed increasing, convex and differentiable.

Substituting equation (2) into equation (1) yields the alternative expression for the parent's
utility:

$$v_0 = E\left[u(c_0) + \beta u(c_1) - h\left(\frac{y_0}{w_0}\right)\right] \tag{3}$$

As usual, the parent's expected utility can be reinterpreted as that of a fictitious dynasty that
lives for two periods and discounts at rate $\beta$.

Following Atkeson and Lucas (1992) and others, we abstract from capital accumulation
to concentrate on the distributional assignment of goods across agents within a period, and
not over time. An allocation is resource feasible if aggregate consumption in each period is
not greater than the sum of endowments and production:

$$\int_0^\infty c_0(w_0)\,dF(w_0) \le e_0 + \int_0^\infty y_0(w_0)\,dF(w_0) \tag{4}$$

$$\int_0^\infty c_1(w_0)\,dF(w_0) \le e_1 \tag{5}$$


Productivity is private information so incentives need to be provided for truthful revelation.
We say that an allocation is incentive compatible if the parent finds it optimal to reveal her
shock truthfully:

$$u(c_0(w_0)) + \beta u(c_1(w_0)) - h\left(\frac{y_0(w_0)}{w_0}\right) \ge u(c_0(w)) + \beta u(c_1(w)) - h\left(\frac{y_0(w)}{w_0}\right) \tag{6}$$

for all productivity realizations $w_0$ and all reports $w$.
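On a finite grid of productivity types, constraint (6) can be checked directly by comparing the utility from truthful revelation with the utility from every possible misreport. The following is a minimal sketch under assumptions that are ours and not part of the model's general statement: logarithmic utility, quadratic disutility, and a hypothetical two-type grid. The function name `is_incentive_compatible` is illustrative only.

```python
import math


def is_incentive_compatible(w, c0, c1, y0, beta,
                            u=math.log, h=lambda n: n ** 2 / 2, tol=1e-12):
    """Check constraint (6) on a finite grid of productivity types.

    w, c0, c1, y0 are lists of equal length: type i has productivity w[i]
    and is assigned (c0[i], c1[i], y0[i]) when reporting truthfully.
    The defaults for u and h are illustrative assumptions.
    """
    for i in range(len(w)):            # true type
        truthful = u(c0[i]) + beta * u(c1[i]) - h(y0[i] / w[i])
        for j in range(len(w)):        # reported type
            # A misreport delivers type j's bundle, but the effort needed
            # to produce y0[j] depends on the true productivity w[i].
            deviation = u(c0[j]) + beta * u(c1[j]) - h(y0[j] / w[i])
            if deviation > truthful + tol:
                return False
    return True


w = [1.0, 2.0]
# Pooling: every type receives the same bundle, so no misreport can help.
pooling = is_incentive_compatible(w, [1, 1], [1, 1], [1, 1], beta=0.9)
# Same output but more consumption for the high report: the low type deviates.
free_lunch = is_incentive_compatible(w, [1, 2], [1, 2], [1, 1], beta=0.9)
print(pooling, free_lunch)
```

The second allocation fails because a higher report raises consumption in both periods without requiring additional output.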

2     Social  Welfare  and  Efficient  Allocations 

To study all constrained efficient allocations for the two-period economy it is useful to work
with the general welfare criterion

$$W = v_0 + \alpha E v_1, \tag{7}$$

which places some weight $\alpha \ge 0$ on the expected utility of children. As we vary $\alpha$ we can
trace out the entire Pareto frontier, since the latter is convex, as illustrated in Figure 1.
Substituting equations (2) and (3) into equation (7) implies the alternative expression

$$W = E\left[u(c_0) + (\beta + \alpha) u(c_1) - h(y_0/w_0)\right].$$

Thus, the social welfare function is equivalent to the parent's preferences but with a social
discount factor $\hat\beta = \beta + \alpha$ that is higher than the private one as long as $\alpha > 0$.

The planning problem maximizes the welfare criterion $W$ over allocations that are resource
feasible and incentive compatible. Formally, the problem is

$$\max_{c_0, c_1, y_0} \int_0^\infty \left[u(c_0(w_0)) + \hat\beta u(c_1(w_0)) - h(y_0(w_0)/w_0)\right] dF(w_0)$$

subject to the resource constraints in equations (4)-(5) and the incentive compatibility
constraints in equation (6).

It is useful to divide the planning problem into two stages. In the first stage the planner
chooses the profile of output $y_0(w_0)$ and a schedule of incentives $\Delta(w_0)$, which is equal to
utility from consumption $u(c_0(w_0)) + \beta u(c_1(w_0))$ up to a constant. In the second stage, the
planner solves the subproblem of how best to provide the incentives $\Delta(w_0)$, using $c_0(w_0)$ and
$c_1(w_0)$. The key feature is that the second stage involves no incentive constraints; these are
imposed in the first stage. Formally, by introducing $\Delta$ and $\underline{U}$ the full problem can be written
as

$$\max_{c_0, c_1, y_0, \Delta, \underline{U}} \int_0^\infty \left[u(c_0(w_0)) + \hat\beta u(c_1(w_0)) - h(y_0(w_0)/w_0)\right] dF(w_0)$$

subject to $\Delta(w_0) + \underline{U} = u(c_0(w_0)) + \beta u(c_1(w_0))$, the resource constraints in equations (4)-(5)
and the incentive compatibility constraints $\Delta(w_0) - h(y_0(w_0)/w_0) \ge \Delta(w) - h(y_0(w)/w_0)$ for
all $w_0$ and $w$. Note that the incentive constraint does not involve $c_0$, $c_1$ or $\underline{U}$; only $\Delta$ and $y_0$.

For our purposes, it suffices to focus on the second stage that takes $\Delta$ and $y_0$ as given,
which allows us to drop the incentive constraint:

$$\max_{c_0, c_1, \underline{U}} \int_0^\infty \left[u(c_0(w_0)) + \hat\beta u(c_1(w_0))\right] dF(w_0)$$

subject to $\Delta(w_0) + \underline{U} = u(c_0(w_0)) + \beta u(c_1(w_0))$ and the resource constraints in equations (4)-(5).

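It is convenient to rewrite this problem by changing variables, from consumption to utility
assignments $U_0(w) = u(c_0(w))$ and $U_1(w) = u(c_1(w))$, since the objective is then linear
and the constraints strictly convex. Let $C(u)$ denote the inverse of the utility function $u(c)$.
After substituting $U_0(w_0) = \Delta(w_0) + \underline{U} - \beta U_1(w_0)$ out, and dropping the given term $\Delta(w_0)$
from the objective, the problem becomes

$$\max_{U_1, \underline{U}} \int_0^\infty \left[\underline{U} + (\hat\beta - \beta) U_1(w_0)\right] dF(w_0)$$

subject to

$$\int_0^\infty C\left(\Delta(w_0) + \underline{U} - \beta U_1(w_0)\right) dF(w_0) \le e_0 + \int_0^\infty y_0(w_0)\,dF(w_0)$$

$$\int_0^\infty C(U_1(w_0))\,dF(w_0) \le e_1$$

It is easy to see that both resource constraints must bind at an optimum.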


3     The  Main  Result:  Progressive  Estate  Taxation 

In  this  section  we  derive  two  main  results  for  the  two-period  economy  laid  out  in  the  previous 
section.  We  first  show  that  implicit  marginal  tax  rates  on  bequests  must  be  progressive. 
We  then  provide  a  simple  tax  implementation  that  relies  on  two  separate  schedules  for  labor 
income  and  estates. 

3.1     Implicit  Marginal  Taxes 

For any allocation and constant $R > 0$ we can define the associated marginal tax rates $\tau(w_0)$
solving the Euler equation

$$u'(c_0(w_0)) = \beta R\,(1 - \tau(w_0))\,u'(c_1(w_0)) \tag{8}$$

Below,  the  constant  R  plays  the  role  of  the  pre-tax  gross  interest  rate.  Since  our  economy  has 
no  savings  technology,  this  value  is  not  uniquely  pinned  down  in  equilibrium — it  is  completely 
unimportant  for  anything  that  follows.  Different  values  of  R  are  associated  with  different 
levels  for  the  tax,  but  they  do  not  affect  its  shape. 

The first-order condition for $U_1(w_0)$, which is necessary and sufficient for optimality, is

$$\hat\beta - \beta + \beta \lambda_0 C'(U_0(w_0)) = \lambda_1 C'(U_1(w_0)),$$

where $\lambda_t$ is the strictly positive Lagrange multiplier on the resource constraint for period $t$. From
this equation it follows that $U_0(w_0)$ and $U_1(w_0)$ move in the same direction with $w_0$. Since
$U_0(w_0) + \beta U_1(w_0)$ must be increasing, in order to provide incentives, it follows that both
$U_0(w_0)$ and $U_1(w_0)$ are increasing; hence, both consumptions $c_0(w_0)$ and $c_1(w_0)$ are increasing
in $w_0$.

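Using the fact that $C(u)$ is the inverse of $u(c)$, so that $C'(U_t(w_0)) = 1/u'(c_t(w_0))$, and
rearranging we obtain

$$\left[1 + \left(\frac{\hat\beta}{\beta} - 1\right) \frac{u'(c_0(w_0))}{\lambda_0}\right] \frac{u'(c_1(w_0))}{u'(c_0(w_0))} = \frac{\lambda_1}{\beta \lambda_0} \tag{9}$$

From the first order condition for $\underline{U}$ it follows that $1/\lambda_0 = \int_0^\infty (1/u'(c_0(w)))\,dF(w)$. For what
follows we normalize so that $R = \lambda_0/\lambda_1$.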
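To illustrate the first-order condition for the child's utility assignment, consider logarithmic utility, an assumption made only for this example. Then the inverse utility function is exponential, so its derivative equals consumption, and the condition reads beta_hat - beta + beta * lambda_0 * c0(w) = lambda_1 * c1(w). Using 1/lambda_0 = E[c0] and the binding second-period resource constraint E[c1] = e1 gives lambda_1 = beta_hat/e1 and hence c1(w) = e1 * [(1 - beta/beta_hat) + (beta/beta_hat) * c0(w)/E[c0]]. The sketch below evaluates this expression on a hypothetical three-type profile for parental consumption, which is taken as given rather than solved for; the function name and all numbers are illustrative.

```python
def child_consumption_log(c0, prob, e1, beta, beta_hat):
    """Child consumption implied by the first-order condition under log utility.

    c0   : parental consumption by type (taken as given, hypothetical)
    prob : probability of each type (sums to one)
    e1   : second-period endowment
    """
    mean_c0 = sum(p * c for p, c in zip(prob, c0))   # equals 1 / lambda_0
    share = beta / beta_hat                          # weight on parental luck
    return [e1 * ((1.0 - share) + share * c / mean_c0) for c in c0]


c0 = [0.5, 1.0, 1.5]
prob = [0.25, 0.5, 0.25]

# beta_hat = beta: the child's consumption is proportional to the parent's.
c1_private = child_consumption_log(c0, prob, e1=1.0, beta=0.9, beta_hat=0.9)
# beta_hat > beta: the child's consumption is pulled towards the mean e1.
c1_social = child_consumption_log(c0, prob, e1=1.0, beta=0.9, beta_hat=1.0)

for parent, private, social in zip(c0, c1_private, c1_social):
    print(parent, round(private, 4), round(social, 4))
```

In this example the child's consumption is a weighted average of the mean endowment and the parent's relative consumption, with weight beta/beta_hat on the latter, so a higher social discount factor compresses inequality among children while keeping their consumption increasing in the parent's.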

Our first result, derived from equation (9) when $\hat\beta = \beta$, can be viewed as simply restating
the celebrated Atkinson-Stiglitz uniform taxation result for our economy.

Proposition 1. When $\hat\beta = \beta$ the optimal allocation implies a zero marginal estate tax rate:
$\tau(w_0) = 0$ in equation (8) and the marginal rate of substitution $u'(c_1(w_0))/u'(c_0(w_0))$ is equated
across all dynasties, i.e. for all $w_0$.

Atkinson and Stiglitz (1976) showed that, provided preferences over a group of goods are
separable from work effort, then consumption within this group should not be distorted. In
other words, the implied marginal taxes for these goods should be equalized to avoid distorting
their relative consumption: uniform taxation is optimal. In our context, this result applies
to consumption at both dates, $c_0$ and $c_1$, and implies that the ratio of marginal utilities is
equalized across agents, so the estate tax can be normalized to zero.^2

In contrast, whenever $\hat\beta > \beta$ equation (9) implies that the ratio of marginal utilities
is not equalized across agents: there must be some distortion, so the marginal estate tax
cannot be zero. Indeed, since consumption increases with productivity, estate taxation must
be progressive.

Proposition 2. When $\hat\beta > \beta$ the optimal allocation implies a nonzero and progressive marginal
estate tax: $\tau(w_0) \ne 0$ for all $w_0$ and $\tau(w_0)$ is increasing in $w_0$. For $R = \lambda_0/\lambda_1$ the marginal tax
rate is

$$\tau(w_0) = -\left(\frac{\hat\beta}{\beta} - 1\right) u'(c_0(w_0)) \left[\int_0^\infty u'(c_0(w))^{-1}\,dF(w)\right] \tag{10}$$

and $c_0(w_0)$, $c_1(w_0)$ and $y_0(w_0)$ are increasing in $w_0$.

We emphasize that the interesting implication for the tax rate here is that it increases
with productivity: taxation is progressive. Without an aggregate savings technology the
overall level of the estate tax cannot be uniquely pinned down; it is completely irrelevant. Farhi,
Kocherlakota and Werning (2005) extend the analysis to an economy with capital, which
pins down the level of estate taxation.
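The marginal tax formula in Proposition 2 is easy to evaluate on a discrete distribution. The sketch below is illustrative only and rests on assumptions of ours: utility is CRRA with curvature `sigma`, the parental consumption profile is a hypothetical increasing three-type example rather than the solution of the planning problem, and the integral is replaced by a probability-weighted sum. The function name `estate_tax_rates` is not from the paper.

```python
def estate_tax_rates(c0, prob, beta, beta_hat, sigma=2.0):
    """Evaluate tau(w) = -(beta_hat/beta - 1) * u'(c0(w)) * E[1/u'(c0)].

    Assumes CRRA marginal utility u'(c) = c ** (-sigma); c0 and prob are
    hypothetical discrete stand-ins for the consumption profile and F.
    """
    marginal_u = [c ** (-sigma) for c in c0]
    inv_lambda0 = sum(p / mu for p, mu in zip(prob, marginal_u))  # 1 / lambda_0
    wedge = beta_hat / beta - 1.0
    return [-wedge * mu * inv_lambda0 for mu in marginal_u]


c0 = [0.5, 1.0, 1.5]          # increasing in productivity
prob = [0.25, 0.5, 0.25]

taus = estate_tax_rates(c0, prob, beta=0.9, beta_hat=1.0)
taus_no_weight = estate_tax_rates(c0, prob, beta=0.9, beta_hat=0.9)

for c, tau in zip(c0, taus):
    print(c, round(tau, 4))
```

Under this normalization every rate is negative, a subsidy, but the rate rises with parental consumption, which is the sense in which the schedule is progressive. Setting the social discount factor equal to the private one returns a zero rate for every type, as in Proposition 1.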

3.2     A  Simple  Tax  Implementation 

We next show that we can implement efficient allocations, and the progressive implicit marginal
tax rates that go with them, with a simple tax system. In our implementation, the government
confronts parents with two separate schedules: an income tax and an estate tax. We say that
an allocation is implementable by nonlinear income and estate taxation $T^y(y_0)$, $T_2$ and $T^b(b)$
if, for all $w_0$, the allocation $(c_0(w_0), c_1(w_0), y_0(w_0))$ solves

$$\max_{c_0, c_1, y_0} \left\{u(c_0) + \beta u(c_1) - h(y_0/w_0)\right\}$$

subject to

$$c_0 + b = y_0 - T^b(b) - T^y(y_0),$$
$$c_1 = Rb + y_2 - T_2.$$

^2 One difference is that Atkinson and Stiglitz (1976) assume a linear technological transformation between
goods, whereas we assume no possible transformation. Their result on uniform taxation implies that
marginal rates of substitution are equalized across agents and that they are all equal to the marginal rate of
transformation. Our result only emphasizes the former.
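It is trivial to change things so that it is the child that pays the estate tax at $t = 1$.
Furthermore, without loss of generality we can assume that $y_2 - T_2 = 0$. To see this, define
$\tilde b = b + (y_2 - T_2)/R$; then $c_1 = R\tilde b$ and

$$c_0 + \tilde b = y_0 - T^b\left(\tilde b - (y_2 - T_2)/R\right) - T^y(y_0) + (y_2 - T_2)/R = y_0 - \tilde T^b(\tilde b) - \tilde T^y(y_0),$$

where $\tilde T^y(y_0) = T^y(y_0) - (y_2 - T_2)/R$ and $\tilde T^b(\tilde b) = T^b\left(\tilde b - (y_2 - T_2)/R\right)$.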

Our next result establishes formally that efficient allocations can be implemented with
separate nonlinear income and estate taxation. The idea is to define $T^b(b)$ so that

$$\frac{1}{1 + T^{b\prime}(c_1(w))} = 1 - \tau(w).$$

The proof then exploits the fact that marginal tax rates are progressive to ensure that the
bequest problem faced by the parent is convex.

Proposition 3. Suppose $c_0(w_0)$, $c_1(w_0)$, $y_0(w_0)$ and $\tau(w_0)$ are increasing functions. Then
there exist tax functions $T^y(y)$ and $T^b(b)$ that implement this allocation, with $T^b(b)$ convex.

Proof. Use the generalized inverse of $c_1(w)$, where possible flat portions of $c_1(w)$ define
discontinuous jumps, to define

$$T^{b\prime}(b) = \frac{1}{1 - \tau\left((c_1)^{-1}(b)\right)} - 1 \tag{11}$$

and normalize so that $T^b(0) = 0$. Note that by the monotonicity of $\tau(w)$ and $c_1(w)$, the
function $T^b(b)$ is convex. Next define net income

$$I(w_0) = c_0(w_0) + R^{-1} c_1(w_0) + T^b(c_1(w_0)).$$

We can express this in terms of output $y$ by using the inverse of $y_0(w_0)$: $I^y(y) = I(y_0^{-1}(y))$.
Then we let $T^y(y_0) = y_0 - I^y(y_0)$. Finally, let the consumption allocation as a function of net
income $I$ be $(\hat c_0(I), \hat c_1(I)) = (c_0(I^{-1}(I)), c_1(I^{-1}(I)))$.

We now show that the constructed tax functions, $T^y(y)$ and $T^b(b)$, implement the allocation.
For any given net income $I$ the consumer solves the subproblem

$$V(I) = \max_{c_0, c_1} \left\{u(c_0) + \beta u(c_1)\right\}$$

subject to $c_0 + R^{-1} c_1 + T^b(c_1) \le I$. This problem is convex: the objective is concave and the
constraint set is convex, since $T^b$ is convex. It follows that the first-order condition

$$\frac{1}{1 + T^{b\prime}(b)} = \frac{1}{\beta R}\,\frac{u'(c_0)}{u'(c_1)}$$

is sufficient for optimality. Combining equation (8) and equation (11) it follows that these
conditions for optimality are satisfied by $\hat c_0(I), \hat c_1(I)$ for all $I$. Hence $V(I) = u(\hat c_0(I)) +
\beta u(\hat c_1(I))$.

Next, consider the worker's maximization over $y_0$ given by

$$\max_y \left\{V(I^y(y)) - h(y/w_0)\right\}.$$

We need to show that $y_0(w_0)$ solves this problem, which implies that the allocation is implemented,
since consumption would be given by $\hat c_0(I^y(y_0(w_0))) = c_0(w_0)$ and $\hat c_1(I^y(y_0(w_0))) = c_1(w_0)$.
Now, from the previous paragraph and our definitions it follows that

$$y_0(w_0) \in \arg\max_y \left\{V(I^y(y)) - h(y/w_0)\right\}$$

$$\iff y_0(w_0) \in \arg\max_y \left\{u(\hat c_0(I^y(y))) + \beta u(\hat c_1(I^y(y))) - h(y/w_0)\right\}$$

$$\iff w_0 \in \arg\max_w \left\{u(c_0(w)) + \beta u(c_1(w)) - h(y_0(w)/w_0)\right\}$$

Thus, the first line follows from the last, which is guaranteed by the assumed incentive
compatibility of the allocation, equation (6). Hence, $y_0(w_0)$ is optimal and it follows that
$(c_0(w_0), c_1(w_0), y_0(w_0))$ is implemented by the constructed tax functions. □

3.3     Discussion 

Without estate taxation there is perfect inheritability of welfare. In particular, the consumption
of parent and child moves in tandem, one-for-one. This situation is only optimal when the
children are not considered independently in the welfare criterion, so that insuring them
against the risk of their parent's fortune is not valued.

In contrast, when insurance is provided to the children's generation their consumption
still varies with their parent's, but less than one-for-one. The intergenerational transmission
of welfare is imperfect: consumption mean reverts across generations. The progressivity of
the estate tax schedule reflects this mean reversion. Fortunate parents must face a lower net
return on bequests in order to give them incentives to tilt their consumption towards the
present, that is, towards themselves. Likewise, poorer parents need to face higher net returns
so that their consumption slopes upward. This explains the progressivity of estate taxes.

Another intuition is based on the interpretation of altruism as a form of externality. In
the presence of externalities, some form of corrective Pigouvian taxation is generally desirable.
Think of a parental bequest as a consumption good with a positive externality to the child;
then the Pigouvian logic implies that we should subsidize bequests. Since expected utility is
our concern, and utility is concave, this externality is greatest for children with low consumption.
Thus, the subsidy rate should be highest (or equivalently, the negative tax should be
lowest) for poor parents. Optimal estate taxation is thus progressive. Since our economy has
no capital, the Pigouvian level of taxation turns out to be irrelevant: we may tax or subsidize
estates. However, the relative tax conclusion in this argument remains robust.

None of these arguments require the private-information structure. However, if productivity
or effort were observable, then the first-best allocation would be achievable. Consumption
and wealth would then be equated across parents. Although one can still think of a progressive
estate tax in this situation for out-of-equilibrium levels of parental wealth, it becomes
irrelevant given the lack of parental inequality. In this sense, our results rely on an interaction
between redistributive and corrective motives for taxation (see also Amador, Angeletos and
Werning, 2005).

4     A  Mirrleesian  Economy  with  Infinite  Horizon 

We now turn to a repeated version of this economy with an infinite horizon, as in Albanesi and
Sleet (2004). All generations work and receive a random productivity draw. An individual
born into generation $t$ has ex-ante welfare $v_t$ solving

$$v_t = E_{t-1}\left[u(c_t) - h(n_t) + \beta v_{t+1}\right],$$

where $\beta < 1$ is the coefficient of altruism. We assume that the utility function over consumption
satisfies the Inada conditions $u'(0) = \infty$ and $u'(\infty) = 0$. We adopt a power disutility
function $h(n) = n^\gamma/\gamma$ with $\gamma > 1$ to ensure that the planning problem is convex.

An individual with productivity $w$, exerting work effort $n$, produces output $y = w \cdot n$. Utility can then be written as

\[ v_t = \sum_{s=0}^{\infty} \beta^s \, \mathbb{E}_{t-1}\left[ u(c_{t+s}) - \theta_{t+s} h(y_{t+s}) \right] \tag{12} \]

where $\theta_t \equiv w_t^{-\gamma}$ can be interpreted as a taste shock to producing output. Productivity $w_t$, and hence $\theta_t$, is independently and identically distributed across dynasties and generations $t = 0, 1, \ldots$ With innate talents assumed nonheritable, intergenerational transmission of welfare is not mechanically linked through the environment but may arise to provide incentives for altruistic parents.
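As an illustrative check of this change of variables, the following Python sketch verifies numerically that the disutility of effort $h(y/w)$ coincides with $\theta h(y)$ when $\theta = w^{-\gamma}$ and $h$ is the power function. The function names and the numbers used are hypothetical choices made for this example only.

```python
def disutility(effort: float, gamma: float) -> float:
    """Power disutility h(n) = n**gamma / gamma, with gamma > 1."""
    return effort ** gamma / gamma


def taste_shock(productivity: float, gamma: float) -> float:
    """Taste shock theta = w**(-gamma) implied by productivity w."""
    return productivity ** (-gamma)


def disutility_of_output(output: float, productivity: float, gamma: float) -> float:
    """Disutility of producing y with productivity w, written as theta * h(y)."""
    return taste_shock(productivity, gamma) * disutility(output, gamma)


# Hypothetical numbers chosen only for illustration.
w, y, gamma = 2.0, 3.0, 2.0
direct = disutility(y / w, gamma)               # h(n) with n = y / w
via_shock = disutility_of_output(y, w, gamma)   # theta * h(y)
print(direct, via_shock)
```

The two numbers agree because $(y/w)^{\gamma}/\gamma = w^{-\gamma} \, y^{\gamma}/\gamma$, which is why productivity can be treated as a multiplicative shock to the disutility of output.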

Since productivity shocks are assumed to be privately observed by individuals and their descendants, each dynasty faces a sequence of consumption functions $\{c_t\}$, where $c_t(\theta^t)$ represents an individual's consumption after reporting the history $\theta^t \equiv (\theta_0, \theta_1, \ldots, \theta_t)$. A dynasty's reporting strategy $\sigma \equiv \{\sigma_t\}$ is a sequence of functions $\sigma_t : \Theta^{t+1} \to \Theta$ that maps histories of shocks $\theta^t$ into a current report $\hat\theta_t$. Any strategy $\sigma$ induces a history of reports $\sigma^t : \Theta^{t+1} \to \Theta^{t+1}$. We use $\sigma^*$ to denote the truth-telling strategy with $\sigma^*_t(\theta^t) = \theta_t$ for all $\theta^t \in \Theta^{t+1}$.

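Given an allocation $\{c_t\}$, the utility obtained from any reporting strategy $\sigma$ is

\[ U(\{c_t\}, \sigma; \beta) \equiv \sum_{t=0}^{\infty} \sum_{\theta^t} \beta^t \left[ u(c_t(\sigma^t(\theta^t))) - \theta_t h(y_t(\sigma^t(\theta^t))) \right] \Pr(\theta^t). \]

An allocation $\{c_t\}$ is incentive compatible if truth-telling is optimal, so that

\[ U(\{c_t\}, \sigma^*; \beta) \ge U(\{c_t\}, \sigma; \beta) \tag{13} \]

for all strategies $\sigma$.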

We identify dynasties by their initial utility entitlement $v_0$ with distribution $\psi$ in the population. An allocation is a sequence of functions $\{c_t^v, y_t^v\}$ for each $v$, where $c_t^v(\theta^t)$ and $y_t^v(\theta^t)$ represent the consumption and income that a dynasty with initial entitlement $v$ gets at date $t$ after reporting the sequence of shocks $\theta^t$. For any given initial distribution of entitlements $\psi$ and resources $e$, we say that an allocation $\{c_t^v, y_t^v\}$ is feasible if: (i) it is incentive compatible for all dynasties; (ii) it delivers expected utility of $v$ to all initial dynasties entitled to $v$; and (iii) average consumption in the population does not exceed the fixed endowment $e$ plus income generated in all periods:

\[ \int \sum_{\theta^t} c_t^v(\theta^t) \Pr(\theta^t) \, d\psi(v) \le e + \int \sum_{\theta^t} y_t^v(\theta^t) \Pr(\theta^t) \, d\psi(v), \qquad t = 0, 1, \ldots \tag{14} \]

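Consider the sum of expected utilities weighted by geometric Pareto weights $\alpha_t = \hat\beta^t$:

\[ \sum_{t=0}^{\infty} \alpha_t \mathbb{E}_{-1} v_t = \left( 1 - \frac{\hat\beta}{\hat\beta - \beta} \right) v_0 + \frac{\hat\beta}{\hat\beta - \beta} \sum_{t=0}^{\infty} \hat\beta^t \, \mathbb{E}_{-1}\left[ u(c_t) - \theta_t h(y_t) \right], \tag{15} \]

with $\hat\beta > \beta$. The first term is exogenously given, since we take as given a distribution for the initial utility entitlements $v_0$. Thus, the welfare criterion is given by

\[ \sum_{t=0}^{\infty} \hat\beta^t \, \mathbb{E}_{-1}\left[ u(c_t) - \theta_t h(y_t) \right]. \tag{16} \]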

Future generations are already indirectly valued through the altruism of the current generation. If, in addition, they are also directly included in the welfare function, the social discount factor must be higher than the private one (see Farhi and Werning, 2005, for more details).
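The decomposition of the Pareto-weighted sum of dynastic utilities into a term in $v_0$ plus a $\hat\beta$-discounted sum of period utilities can be checked numerically. The sketch below assumes, purely for illustration, a deterministic stream of period utilities $U_t = \rho^t$; this stream, the function names and the parameter values are hypothetical and not part of the model.

```python
def weighted_welfare_direct(beta, beta_hat, rho, horizon=2000):
    """Left-hand side: sum_t beta_hat**t * v_t, truncated at `horizon`.

    With the assumed flows U_t = rho**t, dynastic utility has the closed form
    v_t = sum_s beta**s * rho**(t + s) = rho**t / (1 - beta * rho).
    """
    return sum(beta_hat ** t * rho ** t / (1.0 - beta * rho) for t in range(horizon))


def weighted_welfare_decomposed(beta, beta_hat, rho, horizon=2000):
    """Right-hand side: a term in v_0 plus a beta_hat-discounted sum of flows."""
    v0 = 1.0 / (1.0 - beta * rho)
    weight = beta_hat / (beta_hat - beta)   # requires beta_hat > beta
    discounted_flows = sum((beta_hat * rho) ** t for t in range(horizon))
    return (1.0 - weight) * v0 + weight * discounted_flows


# Hypothetical parameter values with beta_hat > beta.
lhs = weighted_welfare_direct(0.9, 0.95, 0.8)
rhs = weighted_welfare_decomposed(0.9, 0.95, 0.8)
print(lhs, rhs)
```

Because the term in $v_0$ is fixed by the initial entitlements, only the $\hat\beta$-discounted sum of period utilities matters for ranking allocations.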

When $\hat\beta = \beta$, the planning problem seeks the lowest constant resource level $e$ to ensure that there exists a feasible allocation that delivers the distribution of utility entitlements $\psi$. This is precisely the efficiency problem studied in Albanesi and Sleet (2004). When $\hat\beta > \beta$ we define the social optimum as maximizing the average social welfare function (16), weighted by $\psi$, over all feasible allocations. That is, the social planning problem given an initial distribution of entitlements $\psi$ and an endowment level $e$ is to maximize

\[ \int U(\{c_t^v\}, \sigma^*; \hat\beta) \, d\psi(v) \tag{17} \]

subject to the resource constraints (14), as well as the promise-keeping and incentive constraints: $v = U(\{c_t^v\}, \sigma^*; \beta)$ and $U(\{c_t^v\}, \sigma^*; \beta) \ge U(\{c_t^v\}, \sigma; \beta)$ for all initial entitlements $v$ and strategies $\sigma$.

We are interested in distributions of utility entitlements $\psi$ such that the solution to the planning problem features, in each period, a cross-sectional distribution of continuation utilities $v_t$ that is also distributed according to $\psi$. We also require the cross-sectional distribution of consumption and income to replicate itself over time. We term any initial distribution of entitlements with these properties a steady state and denote it by $\psi^*$. Following Farhi and Werning (2005), we approach the planning problem by studying a relaxed version of it. The solutions to both problems coincide for steady state distributions $\psi^*$, which is all we seek to characterize. The relaxed problem has continuation utility as a state variable that follows a Markov process. Steady states are then invariant distributions of this Markov process.

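Define the relaxed planning problem to be equivalent to the social planning problem except that the sequence of resource constraints (14) is replaced by the single intertemporal condition

\[ \int \sum_{t=0}^{\infty} \hat\beta^t \sum_{\theta^t} \left( c_t^v(\theta^t) - y_t^v(\theta^t) \right) \Pr(\theta^t) \, d\psi(v) \le \frac{1}{1 - \hat\beta} \, e. \tag{18} \]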

Letting $\lambda$ be the multiplier for this intertemporal resource constraint we form the Lagrangian $L \equiv \int L^v \, d\psi(v)$ where

\[ L^v \equiv \sum_{t=0}^{\infty} \sum_{\theta^t} \hat\beta^t \left( u(c_t^v(\theta^t)) - \lambda c_t^v(\theta^t) - \theta_t h(y_t^v(\theta^t)) + \lambda y_t^v(\theta^t) \right) \Pr(\theta^t) \tag{19} \]

and study the maximization of $L$ subject to $v = U(\{c_t^v\}, \sigma^*; \beta) \ge U(\{c_t^v\}, \sigma; \beta)$ for all $v$ and $\sigma$. For any endowment level $e$, there exists a unique positive multiplier $\lambda(e)$ such that maximizing this Lagrangian is equivalent to solving the relaxed problem. Maximizing $L$ is equivalent to the pointwise optimization, for each $v$, of the subproblem:

\[ k(v) \equiv \sup L^v \tag{20} \]

subject to $v = U(\{c_t^v\}, \sigma^*; \beta) \ge U(\{c_t^v\}, \sigma; \beta)$ for all $\sigma$.

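The value function of the component planning problem $k(v)$ defined by equation (20) is continuous, concave, and satisfies the Bellman equation

\[ k(v) = \max \mathbb{E}\left[ u(\theta) - \lambda c(u(\theta)) - \theta h(\theta) + \lambda y(h(\theta)) + \hat\beta k(w(\theta)) \right] \tag{21} \]

subject to

\[ v = \mathbb{E}\left[ u(\theta) - \theta h(\theta) + \beta w(\theta) \right] \tag{22} \]

\[ u(\theta) - \theta h(\theta) + \beta w(\theta) \ge u(\theta') - \theta h(\theta') + \beta w(\theta') \quad \text{for all } \theta, \theta' \in \Theta. \tag{23} \]

Denote by $g^w(v, \theta)$ and $g^u(v, \theta)$ the optimal policy functions for $w$ and $u$. The next lemma characterizes some key properties of the value function $k(v)$.

Lemma 1. The value function $k(v)$ is strictly concave and continuously differentiable on $(\underline{v}, \bar{v})$ where $\underline{v} = -\infty$; it is unbounded below on both sides, $\lim_{v \to \underline{v}} k(v) = \lim_{v \to \bar{v}} k(v) = -\infty$; and the derivative has $\lim_{v \to \underline{v}} k'(v) = 1$ and $\lim_{v \to \bar{v}} k'(v) = -\infty$.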

5      Steady States and Progressive Taxation

We are interested in steady state distributions $\psi^*$ that have no mass at misery $\underline{v}$. Our first result is that this is not possible when future generations are not weighed directly, so that $\hat\beta = \beta$. We then show that, in contrast, whenever $\hat\beta > \beta$ a steady state distribution exists with no mass at misery. The efficient allocation displays a form of mean reversion across generations that keeps inequality bounded. The mean reversion is characterized by a modified inverse Euler equation which implies that estate taxation is progressive.

5.1     An Immiseration Result

For $\hat\beta = \beta$, we have to modify our definition of the social planning problem. For any distribution $\psi$ of initial welfare entitlements, the planning problem is to minimize the net resources required to deliver the utility entitlements in an incentive compatible way:

\[ \inf e \tag{24} \]

subject to

\[ \int \sum_{\theta^t} \left( c_t^v(\theta^t) - y_t^v(\theta^t) \right) \Pr(\theta^t) \, d\psi(v) \le e, \qquad t = 0, 1, \ldots \tag{25} \]

\[ U(\{c_t^v\}, \sigma^*; \beta) = v \quad \text{for all } v \tag{26} \]

\[ U(\{c_t^v\}, \sigma^*; \beta) \ge U(\{c_t^v\}, \sigma; \beta) \quad \text{for all } v \text{ and } \sigma \tag{27} \]

From this program, we can define an invariant distribution exactly as in Section 4 of the paper. We are interested in steady state distributions $\psi^*$ without full mass at misery. Our first result is that this is basically not possible when $\hat\beta = \beta$.

Proposition 4. Suppose that $\limsup_{u \to \infty} c''(u)/c'(u) < \infty$. Then if $\hat\beta = \beta$, there exists no invariant distribution $\psi^*$ without full mass at misery.

This  result  extends  the  immiseration  result  in  Atkeson  and  Lucas  (1992),  who  study  an 
endowment  economy  with  privately  observed  taste  shocks,  instead  of  the  Mirrleesian  pro- 
duction economy  with  privately  observed  productivity  shocks  studied  here.  They  show  that 
the  cross-sectional  distribution  of  consumption  disperses  steadily  over  time,  with  inequality 
growing without bound. As a result, almost everyone converges to misery, consuming nothing, while a vanishing fraction tends towards bliss, consuming the entire aggregate endowment. Thus, no steady state distribution with positive consumption exists. To the best of our knowledge Proposition 4 is the first formal statement of an analogous result in the context of a Mirrleesian economy, where private information concerns productivity shocks. Researchers who assume $\hat\beta = \beta$ have typically been forced to impose an ad hoc lower bound on continuation utility to avoid misery and ensure that a steady-state distribution exists (Atkeson and Lucas, 1995; Albanesi and Sleet, 2004).

5.2      Steady  States  and  a  Modified  Inverse  Euler  Equation 

We now return to efficient allocations where future generations are given positive weight. We first derive an important intertemporal condition that must be satisfied by the optimal allocation. This condition has interesting implications for the optimal estate tax, computed later.

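Let $\nu$ be the multiplier on the promise-keeping constraint (22), written $\nu$ to keep it distinct from the resource multiplier $\lambda$, and let $\mu(\theta, \theta')$ represent the multipliers on the incentive constraints (23). Then the first-order conditions for interior solutions for $u(\theta)$ and $w(\theta)$ are

\[ p(\theta) - \lambda c'(u(\theta)) p(\theta) - \nu p(\theta) - \sum_{\theta'} \mu(\theta, \theta') + \sum_{\theta'} \mu(\theta', \theta) = 0 \tag{28} \]

\[ \hat\beta k'(w(\theta)) p(\theta) - \beta \nu p(\theta) - \beta \sum_{\theta'} \mu(\theta, \theta') + \beta \sum_{\theta'} \mu(\theta', \theta) = 0 \tag{29} \]

The envelope condition is $k'(v) = \nu$. Summing the first-order condition for $w(\theta)$ over $\theta$, so that the incentive multipliers cancel, we obtain the CLAR equation

\[ \frac{\beta}{\hat\beta} k'(v) = \sum_{\theta} k'(g^w(v, \theta)) p(\theta). \tag{30} \]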

This equation encapsulates the mean-reversion force in the model. In sequential notation

\[ \frac{\beta}{\hat\beta} k'(v_t) = \mathbb{E}_t\left[ k'(v_{t+1}) \right], \tag{31} \]

so that $\beta/\hat\beta < 1$ acts as an autoregressive coefficient ensuring that over time the derivative $k'(v_t)$ mean reverts back to zero, where the function $k(v)$ finds its interior maximum. The mean-reverting force provided by $\hat\beta > \beta$ is crucial for the existence of steady state distributions with bounded inequality, which we prove below. In contrast, when $\hat\beta = \beta$ no such central tendency exists: inequality increases, immiseration ensues, and no steady state exists (Proposition 4).
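To make the autoregressive role of $\beta/\hat\beta$ concrete, the following Python sketch iterates the sequential CLAR condition forward: by the law of iterated expectations, the expected value of $k'(v_{t+j})$ is $(\beta/\hat\beta)^j k'(v_t)$. The function name and the parameter values are hypothetical and serve only as an illustration.

```python
def expected_marginal_value(k_prime_now, beta, beta_hat, generations):
    """Expected k'(v_{t+j}) given k'(v_t), obtained by iterating
    (beta / beta_hat) * k'(v_t) = E_t[k'(v_{t+1})] forward j times."""
    ratio = beta / beta_hat
    return k_prime_now * ratio ** generations


# With beta_hat > beta the expected derivative decays geometrically towards zero.
decaying = [expected_marginal_value(1.0, 0.9, 0.95, j) for j in (0, 1, 10, 50)]

# With beta_hat = beta the derivative is a martingale: no mean reversion.
martingale = [expected_marginal_value(1.0, 0.9, 0.9, j) for j in (0, 1, 10, 50)]
print(decaying, martingale)
```

The contrast between the two lists mirrors the contrast in the text: a geometric pull towards the interior maximum of $k$ when future generations are weighed directly, and no central tendency otherwise.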

The optimal resolution of the tradeoff between incentives for altruistic parents and insurance for newborns gives rise to a less than one-for-one intergenerational transmission of welfare, in contrast to the case where $\hat\beta = \beta$. The descendants of a rich parent are more fortunate than those of a poor parent, but less and less so the more distant the descendant: the impact of the initial fortune of dynasties dies out over generations.

The more weight is put on future generations, the higher is $\hat\beta$ compared to $\beta$, and the weaker is the link between the welfare of parents and child. But as we will now show, even the smallest amount of mean reversion in the form of $\hat\beta > \beta$ puts enough limits on the transmission of shocks across generations to prevent the distribution of consumption and welfare from exploding.
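
The first-order conditions (28)-(29) imply that

\[ \frac{\hat\beta}{\beta} k'(w(\theta)) = 1 - \lambda c'(u(\theta)) \quad \text{and} \quad \frac{\hat\beta}{\beta} k'(v) = 1 - \lambda c'(u_-), \tag{32} \]

where $u_-$ should be interpreted as the previous period's assignment of utility from consumption. Substituting the relations in equation (32) into the CLAR equation (30), and using $c'(u) = 1/u'(c)$ since $c(\cdot)$ is the inverse of the utility function, we arrive at a Modified Inverse Euler equation

\[ \frac{1}{u'(c_-)} = \frac{\hat\beta}{\beta} \sum_{\theta} \frac{1}{u'(c(\theta))} p(\theta) - \frac{1}{\lambda} \left( \frac{\hat\beta}{\beta} - 1 \right). \tag{33} \]

The left-hand side together with the first term on the right-hand side is the standard inverse Euler equation. The second term on the right-hand side is novel, since it is zero when $\hat\beta = \beta$ and is strictly negative when $\hat\beta > \beta$.^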
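For readers who want the intermediate algebra, the following sketch derives the modified inverse Euler equation from the CLAR equation (30) and the two relations in (32). It assumes, as the notation $c(u)$ suggests, that $c(\cdot)$ is the inverse of the utility function, so that $c'(u) = 1/u'(c)$.

```latex
\begin{align*}
\frac{\beta}{\hat\beta}\, k'(v)
  &= \sum_\theta k'(w(\theta))\, p(\theta)
  && \text{(CLAR equation)} \\
\frac{\beta}{\hat\beta} \cdot \frac{\beta}{\hat\beta} \bigl( 1 - \lambda c'(u_-) \bigr)
  &= \frac{\beta}{\hat\beta} \sum_\theta \bigl( 1 - \lambda c'(u(\theta)) \bigr) p(\theta)
  && \text{(substitute } k' = \tfrac{\beta}{\hat\beta}(1 - \lambda c') \text{)} \\
\frac{\beta}{\hat\beta} \bigl( 1 - \lambda c'(u_-) \bigr)
  &= 1 - \lambda \sum_\theta c'(u(\theta))\, p(\theta) \\
c'(u_-)
  &= \frac{\hat\beta}{\beta} \sum_\theta c'(u(\theta))\, p(\theta)
     - \frac{1}{\lambda} \Bigl( \frac{\hat\beta}{\beta} - 1 \Bigr)
\end{align*}
```

Replacing $c'(u)$ by $1/u'(c)$ in the last line gives the equation in terms of marginal utilities of consumption.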

We now show that a steady state exists whenever the welfare criterion places direct weight on children so that $\hat\beta > \beta$. The proof closely follows Farhi and Werning (2005), who prove such a result for an economy with taste shocks.

Proposition 5. (a) There exists an invariant distribution $\psi^*$ for the Markov process $\{v_t\}$ implied by $g^w$. Moreover any invariant distribution $\psi^*$ has a support bounded away from misery $\underline{v}$. (b) Suppose that $\limsup_{u \to \infty} c''(u)/c'(u) < \infty$; then any invariant distribution necessarily has a support bounded away from $\bar{v}$.

An invariant distribution always exists, but when absolute risk aversion is bounded, so that $\limsup_{u \to \infty} c''(u)/c'(u) < \infty$,^ the invariant distribution has a compact support that is bounded away from misery. It follows directly that the allocation implied by the invariant distribution has consumption and work effort that are bounded above. This ensures that the invariant $\psi^*$ is also a steady state of the original planning problem, for some endowment level $e$.^

The result relies heavily on the force for mean reversion that is behind equation (31) and equation (33). To see this mean-reversion force most clearly consider, as an example, the logarithmic utility case, $u(c) = \log(c)$. Then $1/u'(c) = c$ and equation (33) can be written with sequential time notation as

\[ c_t = \frac{\hat\beta}{\beta} \mathbb{E}_t[c_{t+1}] - \frac{1}{\lambda} \left( \frac{\hat\beta}{\beta} - 1 \right) \]

or simply

\[ c_{t+1} = \frac{\beta}{\hat\beta} c_t + \left( 1 - \frac{\beta}{\hat\beta} \right) \bar{c} + \varepsilon_{t+1} \qquad \text{with } \mathbb{E}_t[\varepsilon_{t+1}] = 0 \]

where $\bar{c} = \lambda^{-1}$ is average consumption at the steady-state cross-sectional distribution. As the last expression indicates, with logarithmic utility, consumption itself is autoregressive with an autoregressive coefficient equal to $\beta/\hat\beta < 1$.

^Farhi, Kocherlakota and Werning (2005) show that this equation, and its implications for estate taxation, generalize to an economy with capital and an arbitrary process for skills.

^This is the case for most common preference specifications, such as CARA or CRRA utility functions.

^Indeed, the proof of this result actually shows that promised continuation utility $v_t$ is bounded for all realizations of the shocks, starting from any $v_0$ in the bounded support. It follows that promised utility $v_t$ is bounded for all reporting strategies. This in turn implies that the proposed allocation is incentive compatible, that is, that the temporary incentive constraints in equation (23) imply equation (13) (see Theorem 2 in Farhi and Werning, 2005).
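The implications of this autoregressive law of motion for inequality can be illustrated with a small Python sketch. It tracks the conditional mean of a dynasty's consumption and the cross-sectional variance under the additional assumption, made here only for illustration, that the innovation has a constant variance; in the model the innovation is merely known to have zero conditional mean. All function names and numbers are hypothetical.

```python
def mean_path(c0, c_bar, beta, beta_hat, generations):
    """Expected consumption of descendants: E[c_{t+1}] = rho*E[c_t] + (1-rho)*c_bar."""
    rho = beta / beta_hat
    path = [c0]
    for _ in range(generations):
        path.append(rho * path[-1] + (1.0 - rho) * c_bar)
    return path


def variance_path(var0, shock_var, beta, beta_hat, generations):
    """Cross-sectional variance: Var_{t+1} = rho**2 * Var_t + shock_var.

    Uses the fact that a mean-zero innovation is uncorrelated with c_t;
    the constant `shock_var` is an assumption of this sketch.
    """
    rho = beta / beta_hat
    path = [var0]
    for _ in range(generations):
        path.append(rho ** 2 * path[-1] + shock_var)
    return path


# beta_hat > beta: a rich dynasty's expected consumption drifts back to c_bar
# and the variance converges to shock_var / (1 - rho**2).
rich_dynasty = mean_path(2.0, 1.0, 0.9, 0.95, 200)
bounded = variance_path(0.0, 1.0, 0.9, 0.95, 500)

# beta_hat = beta: random walk, variance grows without bound.
unbounded = variance_path(0.0, 1.0, 0.9, 0.9, 50)
print(rich_dynasty[-1], bounded[-1], unbounded[-1])
```

Under these assumptions the variance settles at a finite level whenever $\beta/\hat\beta < 1$, whereas it grows linearly with the number of generations in the random-walk case.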

5.3     Tax  Implementation 

Any allocation that is incentive compatible and feasible, and has strictly positive consumption, can be implemented by a combination of taxes on labor income and estates. Here we first describe this implementation, and explore some features of the optimal estate tax in the next subsection.

For any incentive-compatible and feasible allocation $\{c_t^v(\theta^t), y_t^v(\theta^t)\}$ we propose an implementation along the lines of Kocherlakota (2004). In each period, conditional on the history of their dynasty's reports $\theta^{t-1}$ and any inherited wealth, individuals report their current shock $\theta_t$, produce, consume, pay taxes and bequeath wealth subject to the following set of budget constraints

\[ c_t + b_t \le y_t(\theta^t) - T_t(\theta^t) + (1 - \tau_t(\theta^t)) R_{t-1,t} b_{t-1}, \qquad t = 0, 1, \ldots \tag{34} \]

where $R_{t-1,t}$ is the before-tax interest rate across generations, and initially $b_{-1} = 0$. Individuals are subject to two forms of taxation: a labor income tax $T_t(\theta^t)$, and a proportional tax on inherited wealth $R_{t-1,t} b_{t-1}$ at rate $\tau_t(\theta^t)$.^

Given a tax policy $\{T_t^v(\theta^t), \tau_t^v(\theta^t), y_t^v(\theta^t)\}$, an equilibrium consists of a sequence of interest rates $\{R_{t,t+1}\}$; an allocation for consumption, labor income and bequests $\{c_t^v(\theta^t), b_t^v(\theta^t)\}$; and a reporting strategy $\{\sigma_t^v(\theta^t)\}$ such that: (i) $\{c_t, b_t, \sigma_t\}$ maximize dynastic utility subject to (34), taking the sequence of interest rates $\{R_{t,t+1}\}$ and the tax policy $\{T_t, \tau_t, y_t\}$ as given; and (ii) the asset market clears so that $\int \mathbb{E}_{-1}[b_t^v(\theta^t)] \, d\psi(v) = 0$ for all $t = 0, 1, \ldots$ We say that a competitive equilibrium is incentive compatible if, in addition, it induces truth telling.

^In this formulation, taxes are a function of the entire history of reports, and labor income $y_t$ is mandated given this history. However, if the labor income histories $y^t : \Theta^t \to \mathbb{R}^t$ being implemented are invertible, then by the taxation principle we can rewrite $T$ and $\tau$ as functions of this history of labor income and avoid having to mandate labor income. Under this arrangement, individuals do not make reports on their shocks, but instead simply choose a budget-feasible allocation of consumption and labor income, taking as given prices and the tax system.

For any feasible, incentive-compatible allocation $\{c_t^v, y_t^v\}$ with strictly positive consumption we construct an incentive-compatible competitive equilibrium with no bequests by setting $T_t^v(\theta^t) = y_t^v(\theta^t) - c_t^v(\theta^t)$ and

\[ \tau_t^v(\theta^t) = 1 - \frac{1}{\beta R_{t-1,t}} \frac{u'(c_{t-1}^v(\theta^{t-1}))}{u'(c_t^v(\theta^t))} \tag{35} \]

for any sequence of interest rates $\{R_{t-1,t}\}$. These choices work because the estate tax ensures that for any reporting strategy $\sigma$, the resulting consumption allocation $\{c_t^v(\sigma^t(\theta^t))\}$ with no bequests $b_t^v(\theta^t) = 0$ satisfies the consumption Euler equation

\[ u'\left( c_{t-1}^v(\sigma^{t-1}(\theta^{t-1})) \right) = \beta R_{t-1,t} \sum_{\theta_t} u'\left( c_t^v(\sigma^t(\theta^t)) \right) \left( 1 - \tau_t^v(\sigma^t(\theta^t)) \right) p(\theta_t). \]

The labor income tax is such that the budget constraints are satisfied with this consumption allocation and no bequests. Thus, this no-bequest choice is optimal for the individual regardless of the reporting strategy followed. Since the resulting allocation is incentive compatible, by hypothesis, it follows that truth telling is optimal. The resource constraints together with the budget constraints then ensure that the asset market clears.^
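The logic of this construction can be sketched in a few lines of Python. The sketch assumes a tax of the form $\tau = 1 - u'(c_{t-1}) / (\beta R \, u'(c_t))$, which is one way to make the consumption Euler equation hold state by state, together with CRRA utility; the function names, the two-point distribution for the child's consumption and all numbers are hypothetical.

```python
def marginal_utility(c, sigma=1.0):
    """CRRA marginal utility u'(c) = c**(-sigma); sigma = 1 is log utility."""
    return c ** (-sigma)


def estate_tax(c_parent, c_child, beta, gross_rate, sigma=1.0):
    """Assumed tax rule: tau = 1 - u'(c_parent) / (beta * R * u'(c_child))."""
    return 1.0 - marginal_utility(c_parent, sigma) / (
        beta * gross_rate * marginal_utility(c_child, sigma)
    )


def euler_residual(c_parent, child_outcomes, beta, gross_rate, sigma=1.0):
    """u'(c_parent) - beta * R * E[(1 - tau) * u'(c_child)].

    `child_outcomes` is a list of (consumption, probability) pairs.
    """
    expected = sum(
        prob * (1.0 - estate_tax(c_parent, c, beta, gross_rate, sigma))
        * marginal_utility(c, sigma)
        for c, prob in child_outcomes
    )
    return marginal_utility(c_parent, sigma) - beta * gross_rate * expected


outcomes = [(0.8, 0.5), (2.0, 0.5)]   # hypothetical child consumption levels
residual = euler_residual(1.0, outcomes, beta=0.9, gross_rate=1.05)

# Only the after-tax return (1 - tau) * R is pinned down, not R itself.
after_tax_low = (1.0 - estate_tax(1.0, 2.0, 0.9, 1.0)) * 1.0
after_tax_high = (1.0 - estate_tax(1.0, 2.0, 0.9, 1.1)) * 1.1
print(residual, after_tax_low, after_tax_high)
```

The residual is zero up to rounding for any before-tax rate, and the after-tax return is the same for both values of $R$, in line with the statement that the implementation works for any sequence of interest rates.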

As noted above, in our economy without capital only the after-tax interest rate matters, so the implementation allows any equilibrium before-tax interest rate $\{R_{t-1,t}\}$. In the next subsection, we set the interest rate to the reciprocal of the social discount factor, $R_{t-1,t} = \hat\beta^{-1}$. This choice is natural because it represents the interest rate that would prevail at the steady state in a version of our economy with capital.

5.4      Optimal  Progressive  Estate  Taxation 

In our environment, the relevant past history is encoded in the continuation utility, so the estate tax $\tau(\theta^{t-1}, \theta_t)$ can actually be reexpressed as a function of $v_t(\theta^{t-1})$ and $\theta_t$. Abusing notation we then denote the estate tax by $\tau_t(v, \theta_t)$. Since we focus on the steady-state, invariant distribution, we also drop the time subscripts and write $\tau(v, \theta)$. The average estate tax rate $\bar\tau(v)$ is then defined by

\[ 1 - \bar\tau(v) \equiv \sum_{\theta} (1 - \tau(v, \theta)) p(\theta) \tag{36} \]


^Since  the  consumption  Euler  equation  holds  with  equality,  the  same  estate  tax  can  be  used  to  implement 
allocations  with  any  other  bequest  plan  with  income  taxes  that  are  consistent  with  the  budget  constraints. 

Using the modified inverse Euler equation (33) we obtain

\[ \bar\tau(v) = -\lambda^{-1} u'(c_-(v)) \left( \frac{\hat\beta}{\beta} - 1 \right). \]

In particular, this implies that the average estate tax rate is negative, $\bar\tau(v) < 0$, so that bequests are subsidized. However, recall that before-tax interest rates are not uniquely determined in our implementation. As a consequence, neither are the estate taxes computed by (35). With our particular choice for the before-tax interest rate, however, the tax rates are pinned down and acquire a corrective, Pigouvian role. Differences in discounting can be interpreted as a form of externality from future consumption, and the negative average tax can then be seen as a way of countering these externalities as prescribed by Pigou. In our setup without capital, this result depends on the choice of the before-tax interest rate. However, the negative tax on estates would be a robust steady-state outcome in a version of our economy with capital.

In our model it is more interesting to understand how the average tax varies with the history of past shocks encoded in the promised continuation utility $v$. The average tax is an increasing function of consumption, which, in turn, is an increasing function of $v$. Thus, estate taxation is progressive: the average tax on transfers for more fortunate parents is higher.

Proposition 6. In the repeated Mirrlees economy, an optimal allocation with strictly positive consumption can be implemented by a combination of income and estate taxes. At a steady-state, invariant distribution $\psi^*$, the optimal average estate tax $\bar\tau(v)$ defined by (35) and (36) is increasing in promised continuation utility $v$.

The progressivity of the estate tax reflects the mean reversion in consumption. The fortunate must face lower net rates of return so that their consumption path decreases towards the mean.^
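A short Python sketch can illustrate both properties of the average estate tax: it is negative, and it rises with the parent's consumption. The sketch evaluates the expression $\bar\tau = -\lambda^{-1} u'(c_-)(\hat\beta/\beta - 1)$ under CRRA utility; the function name, the value of $\lambda$ and the consumption grid are hypothetical choices for illustration.

```python
def average_estate_tax(c_parent, lam, beta, beta_hat, sigma=1.0):
    """Average estate tax -(1/lam) * u'(c_parent) * (beta_hat/beta - 1),
    with CRRA marginal utility u'(c) = c**(-sigma)."""
    return -(1.0 / lam) * c_parent ** (-sigma) * (beta_hat / beta - 1.0)


# Hypothetical grid of parental consumption levels, from poor to rich.
consumption_grid = [0.5, 1.0, 2.0, 4.0]
taxes = [average_estate_tax(c, lam=1.0, beta=0.9, beta_hat=0.95)
         for c in consumption_grid]
for c, tau in zip(consumption_grid, taxes):
    print(f"c = {c:4.1f}  average estate tax = {tau:+.4f}")
```

Every rate in the list is a subsidy, but the subsidy shrinks as parental consumption rises, so richer parents face a lower net return on bequests. Setting `beta_hat` equal to `beta` makes the average tax zero at every consumption level.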

6     Concluding  Remarks 

When  only  the  first  generation's  welfare  is  of  concern,  we  obtain  familiar  results  that  echo 
those  obtained  in  intragenerational  settings.  In  particular,  in  our  simple  two-period  economy 
we  recover  Atkinson-Stiglitz's  uniform-taxation  result.  As  a  consequence,  bequests  should  be 
undistorted  and  the  transmission  of  welfare  perfect:  consumption  of  parent  and  child  should 
move one-for-one. In our infinite-horizon model, we prove an immiseration result that parallels that of Atkeson and Lucas: a dynasty's consumption inherits a random walk property, inequality grows without bound and everyone converges to misery.

^Farhi, Kocherlakota and Werning (2005) explore more general versions of this result and discuss other intuitions.

In  contrast,  when  the  expected  welfare  of  future  generations  is  taken  into  account,  the 
planner  values  insuring  children  against  the  family  they  are  born  into.  We  characterize  efficient 
allocations  and  study  the  role  that  estate  taxation  can  play  in  implementing  these  allocations. 
We find that the estate tax should be progressive to ensure that consumption and welfare
exhibit  mean-reversion  across  generations.  Inequality  is  then  bounded:  a  steady-state  cross- 
section  for  consumption  and  welfare  exists. 

Farhi,  Kocherlakota  and  Werning  (2005)  explore  some  extensions — by  including  physical 
capital  accumulation,  modeling  life-cycle  elements  and  allowing  skills  to  be  correlated  across 
generations — and  show  that  the  main  result  on  progressive  estate  taxation  holds.  However, 
a  number  of  issues  are  still  unexplored.  For  example,  the  effects  of  parental  investments  in 
the child's human capital, of endogenous and variable fertility, and of inter vivos transfers all
remain  open  questions  for  future  research. 

Appendix

A     Proof  of  Lemma  1 

Strict concavity and differentiability follow from standard arguments. In order to derive the limits of $k$ and $k'$ at the bounds of the domain, we derive a lower bound $k^{\min}$ and an upper bound $k^{\max}$, for which we can easily compute the corresponding limits.

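Consider the solution $\{u_t^{v_0}(\theta^t), y_t^{v_0}(\theta^t)\}$ to the relaxed planning problem for a given $v_0$. For all $v < v_0$, define $\{u_t^{v}(\theta^t), y_t^{v}(\theta^t)\}$ by

\[ u_t^{v}(\theta^t) = u_t^{v_0}(\theta^t) \quad \text{for all } t \ge 0 \]

\[ y_t^{v}(\theta^t) = y_t^{v_0}(\theta^t) \quad \text{for all } t \ge 1 \]

Let

\[ k^{\min}(v) \equiv \sum_{t=0}^{\infty} \hat\beta^t \, \mathbb{E}_{-1}\left[ u_t^{v}(\theta^t) - \lambda c(u_t^{v}(\theta^t)) + \lambda y_t^{v}(\theta^t) - \theta_t h(y_t^{v}(\theta^t)) \right] \]

Since $\{u_t^{v}(\theta^t), y_t^{v}(\theta^t)\}$ is incentive compatible and delivers welfare level $v$, we have $k(v) \ge k^{\min}(v)$ for all $v < v_0$. We have

\[ k^{\min\prime}(v) = 1 - \lambda \, \mathbb{E}\left[ \frac{1}{h'\left( h^{-1}\left( h(y_0^{v_0}(\theta_0)) + v_0 - v \right) \right)} \right] \]

Hence

\[ \lim_{v \to -\infty} k^{\min\prime}(v) = 1 \]

Since $k(v) \ge k^{\min}(v)$ for all $v < v_0$ and both $k$ and $k^{\min}$ are concave, this implies that

\[ \lim_{v \to -\infty} k'(v) \le 1 \]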

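Next define

\[ \bar{k}(v) \equiv \sup \sum_{t=0}^{\infty} \hat\beta^t \, \mathbb{E}_{-1}\left[ u(\theta^t) - \lambda c(u(\theta^t)) + \lambda y(\theta^t) - \theta_t h(y(\theta^t)) \right] \]

s.t.

\[ v = \sum_{t=0}^{\infty} \beta^t \, \mathbb{E}_{-1}\left[ u(\theta^t) - \theta_t h(y(\theta^t)) \right] \]

This corresponds to the relaxed planning problem, but without the incentive constraints. Hence we have $k(v) \le \bar{k}(v)$.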
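Let

\[ m \equiv \max_{u, y, \theta} \; u - \lambda c(u) + \lambda y - \theta h(y) \]

Then

\[ \bar{k}(v) \le \sup \sum_{t=0}^{\infty} \beta^t \, \mathbb{E}_{-1}\left[ u(\theta^t) - \lambda c(u(\theta^t)) + \lambda y(\theta^t) - \theta_t h(y(\theta^t)) \right] + m \left[ \frac{1}{1 - \hat\beta} - \frac{1}{1 - \beta} \right] \]

\[ \le v + \sup \left\{ \sum_{t=0}^{\infty} \beta^t \, \mathbb{E}_{-1}\left[ -\lambda c(u(\theta^t)) + \lambda y(\theta^t) \right] \right\} + m \left[ \frac{1}{1 - \hat\beta} - \frac{1}{1 - \beta} \right] \]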

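Hence if we define

\[ C(v) \equiv \inf \sum_{t=0}^{\infty} \beta^t \, \mathbb{E}_{-1}\left[ c(u(\theta^t)) - y(\theta^t) \right] \]

s.t.

\[ v = \sum_{t=0}^{\infty} \beta^t \, \mathbb{E}_{-1}\left[ u(\theta^t) - \theta_t h(y(\theta^t)) \right] \]

and

\[ k^{\max}(v) \equiv v - \lambda C(v) + m \left[ \frac{1}{1 - \hat\beta} - \frac{1}{1 - \beta} \right] \]

we have

\[ k(v) \le \bar{k}(v) \le k^{\max}(v) \]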


Denote by $\{u^C(\theta^t, v), y^C(\theta^t, v)\}$ the solution of the program defining $C$. Combining the first-order conditions for $u(\theta^t)$ and $y(\theta^t)$ with the envelope theorem, we get

\[ c'(u^C(\theta^t, v)) = C'(v) \quad \text{for all } t \ge 0 \]

\[ \frac{1}{\theta_t h'(y^C(\theta^t, v))} = C'(v) \quad \text{for all } t \ge 0 \]

This implies that

\[ \lim_{v \to -\infty} C'(v) = 0 \]

\[ \lim_{v \to -\infty} u^C(\theta^t, v) = \underline{u} \]

\[ \lim_{v \to -\infty} y^C(\theta^t, v) = \infty \]

Hence

\[ \lim_{v \to -\infty} k^{\max\prime}(v) = 1 \]

Since $k \le k^{\max}$ and both $k$ and $k^{\max}$ are concave, this implies that

\[ \lim_{v \to -\infty} k'(v) \ge 1 \]

Since we already have

\[ \lim_{v \to -\infty} k'(v) \le 1 \]

this implies that

\[ \lim_{v \to -\infty} k'(v) = 1 \]

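Note that we always have

\[ \lim_{v \to \bar{v}} C'(v) = +\infty \]

\[ \lim_{v \to \bar{v}} k^{\max\prime}(v) = -\infty \]

Since $k(v) \le k^{\max}(v)$, and both $k$ and $k^{\max}$ are concave, this implies that

\[ \lim_{v \to \bar{v}} k'(v) = -\infty \]

Finally, note that

\[ \lim_{v \to \underline{v}} k^{\max}(v) = \lim_{v \to \bar{v}} k^{\max}(v) = -\infty \]

Hence

\[ \lim_{v \to \underline{v}} k(v) = \lim_{v \to \bar{v}} k(v) = -\infty \]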

B     Proof  of  Proposition  4 

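In order to characterize the optimal allocation it is convenient to study a relaxed problem. The Lagrangian theorem guarantees that there exists a unique sequence of multipliers $\{q_t\}$ with $q_0 = 1$ on (25) such that solving (24) is equivalent to solving the following program:

\[ \inf \sum_{t \ge 0} q_t \int \sum_{\theta^t} \left( c_t^v(\theta^t) - y_t^v(\theta^t) \right) \Pr(\theta^t) \, d\psi(v) \]

subject to (26) and (27). Note that this problem is equivalent to the minimization, $v$ by $v$, of

\[ \sum_{t \ge 0} q_t \sum_{\theta^t} \left( c_t^v(\theta^t) - y_t^v(\theta^t) \right) \Pr(\theta^t) \]

subject to

\[ v = U(\{c_t^v\}, \sigma^*; \beta) \ge U(\{c_t^v\}, \sigma; \beta) \quad \text{for all } \sigma, \]

whose value we denote by $C(v; \{q_t\})$.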

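Hence $C(v; \{q_t\})$ is the least possible cost of an incentive compatible allocation delivering welfare $v$ to the first generation. It is trivial to see that $C(v; \{q_t\})$ is the solution of the following Bellman equation

\[ C(v; \{q_{t+s}\}_{s \ge 1}) = \inf \mathbb{E}\left[ c(u_\theta) - y(h_\theta) + q_{t+1} C\left( w_\theta; \left\{ \frac{q_{t+s}}{q_{t+1}} \right\}_{s \ge 2} \right) \right] \tag{37} \]

subject to

\[ v = \mathbb{E}\left[ u_\theta + \beta w_\theta - \theta h_\theta \right] \]

\[ u_\theta + \beta w_\theta - \theta h_\theta \ge u_{\theta'} + \beta w_{\theta'} - \theta h_{\theta'} \]

For future use, let us denote by $g^w(v, \theta^t)$ the continuation utility after a history of shocks $\theta^t$ when the initial welfare entitlement is $v$.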

Suppose there exists an invariant distribution $\psi^*$, and let $\{q_t\}$ be the associated sequence of multipliers. Since $\psi$ is a state variable for (24), this shows that $q_{t+1}/q_t$ is independent of $t$. Hence there exists $0 < q < 1$ such that $q_{t+1}/q_t = q$ for all $t$. We can therefore drop the time dependence on the sequence $\{q_t\}$ in $C_t(v; \{q_t\})$, and simply write $C(v)$ as a shortcut for $C(v; \{q^s\}_{s \ge 1})$.

Lemma 2. Suppose there exists an invariant distribution $\psi^*$ without full mass at misery. Then $q \ge \beta$.

Proof. We will make use of two possible state variables. The first state variable is the natural one: $v$, promised future utility. The other one is utility attained by the previous generation, $u_-$. Indeed, from the first-order conditions, it is easy to see that these two state variables are related by

\[ c'(u_-) = \frac{q}{\beta} C'(v) \]

The existence of an invariant distribution $\psi^*(v)$ with no mass at misery is equivalent to the existence of an invariant distribution $\psi^*(u_-)$ with no mass at misery.

Let $x_\theta \equiv u_\theta + \beta w_\theta$. Then we can rewrite the Bellman equation (37) as

\[ C(v) = \inf \mathbb{E}\left[ c(u_\theta) - y(h_\theta) + q C(w_\theta) \right] \]

subject to

\[ v = \mathbb{E}\left[ x_\theta - \theta h_\theta \right] \]

\[ x_\theta - \theta h_\theta \ge x_{\theta'} - \theta h_{\theta'} \]

\[ u_\theta + \beta w_\theta = x_\theta \]

Hence, given a value $x$ for $x_\theta$, $u_\theta$ and $w_\theta$ are given by the sub-program

\[ \min \; c(u) + q C(w) \]

subject to

\[ u + \beta w = x \]

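The solution is given by the first-order condition

\[ -c'(x - \beta w) + \frac{q}{\beta} C'(w) = 0 \]

Using the implicit function theorem, we can then compute

\[ \frac{du}{dx} = \frac{\frac{q}{\beta} C''(w)}{\frac{q}{\beta} C''(w) + \beta c''(x - \beta w)} \]

Hence

\[ 0 \le \frac{du}{dx} \le 1 \]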
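The implicit-function step can be spelled out as follows. This is a sketch of the algebra, assuming the sub-program's first-order condition takes the form $-c'(x - \beta w) + (q/\beta) C'(w) = 0$ and that $c$ and $C$ are twice differentiable and convex; the symbol $F$ is introduced here only for the derivation.

```latex
\begin{align*}
F(w, x) &\equiv -c'(x - \beta w) + \frac{q}{\beta} C'(w) = 0 \\
\frac{dw}{dx}
  &= -\frac{\partial F / \partial x}{\partial F / \partial w}
   = \frac{c''(x - \beta w)}{\beta c''(x - \beta w) + \frac{q}{\beta} C''(w)} \\
\frac{du}{dx}
  &= 1 - \beta \frac{dw}{dx}
   = \frac{\frac{q}{\beta} C''(w)}{\frac{q}{\beta} C''(w) + \beta c''(x - \beta w)}
\end{align*}
```

Since $u = x - \beta w$, an extra unit of $x$ is split between current utility and promised utility, and convexity of $c$ and $C$ keeps the share going to current utility between zero and one.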

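This in turn implies that there exists $M > 0$ such that

\[ \max_{\theta, \theta'} |u_{\theta'} - u_\theta| \le M \max_{\theta} h_\theta \]

The first-order conditions for $u_\theta$ in (37) imply that

\[ \frac{\beta}{q} c'(u_-) = \mathbb{E}\left[ c'(u_\theta) \right] \]

Hence, writing $u^{\max} \equiv \max_\theta u_\theta$,

\[ \frac{\beta}{q} c'(u_-) = \mathbb{E}\left[ c'(u_\theta) \right] \le c'(u^{\max}) \]

Therefore,

\[ \log\left( \frac{\beta}{q} \right) + \log(c'(u_-)) \le \log(c'(u^{\max})) \]

and hence

\[ \log\left( \frac{\beta}{q} \right) + \log(c'(u_-)) \le \log(c'(u_-)) + \left( \max_{u \in [u_-, u^{\max}]} \frac{c''(u)}{c'(u)} \right) \left[ u^{\max} - u_- \right] \]

which we can rewrite as

\[ \frac{\log(\beta/q)}{\max_{u \in [u_-, u^{\max}]} c''(u)/c'(u)} \le u^{\max} - u_- \]

Hence for all $\theta \in \Theta$,

\[ \frac{\log(\beta/q)}{\max_{u \in [u_-, u^{\max}]} c''(u)/c'(u)} - M \max_{\theta'} h_{\theta'} \le u_\theta - u_- \]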

In order to allow for bunching in (37), it is convenient to consider the following program

\[ \inf \sum_n p_n \left\{ c(u_n) - y(h_n) + q C(w_n) \right\} \]

subject to

\[ v = \sum_n p_n \left( u_n + \beta w_n - \theta_n h_n \right) \]

\[ -\bar\theta_n h_n + u_n + \beta w_n \ge -\bar\theta_n h_{n+1} + u_{n+1} + \beta w_{n+1} \quad \text{for } n = 1, 2, \ldots, K - 1. \]

This problem and its notation require some discussion. We do not incorporate the monotonicity constraint on $h$. But this notation allows us to consider bunching in the following way. If any set of neighboring agents is bunched, then we group these agents under a single index and let $p_n$ be the total probability of this group. Likewise let $\theta_n$ represent the conditional average of $\theta$ within this group, which is what is relevant for the promise-keeping constraint and the objective function. Let $\bar\theta_n$ be the shock of the highest agent in the group. The incentive constraint must rule out the highest agent in each group deviating and taking the allocation of the group above him.

Of course, every combination of bunched agents leads to a different program. The optimal allocation of our problem must solve one of these programs with a strictly monotone allocation, since bunching can be characterized by regrouping agents. Thus, below we characterize solutions to these programs with strict monotonicity of the solution.

The first-order condition for $h_n$ is

\[
y'(h_n) = C'(v)\,\bar\theta_n + \frac{\theta_n \mu_{n,n+1} - \theta_{n-1}\mu_{n-1,n}}{p_n} .
\]

This implies in particular that at the optimum, for any of these programs (and hence for the
program solved by the true optimal allocation),

\[
y'(h_1) \ge C'(v)\,\bar\theta_1 .
\]

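It is easy to verify that $C \ge \underline{C}$, where $\underline{C}$ is the solution to (37) without the incentive
compatibility constraints. Let $\bar v$ be the upper bound of the domain for $v$. Since both $C$ and
$\underline{C}$ are increasing and convex, and since

\[
\lim_{v \to \bar v} \underline{C}(v) = \infty \quad \text{and} \quad \lim_{v \to \bar v} \underline{C}'(v) = \infty ,
\]

we have

\[
\lim_{v \to \bar v} C(v) = \infty \quad \text{and} \quad \lim_{v \to \bar v} C'(v) = \infty .
\]

Therefore,

\[
\lim_{v \to \bar v} y'(h_1(v)) = \infty \quad \Rightarrow \quad \lim_{v \to \bar v} h_1(v) = 0 ,
\]

and since $h_\theta$ is decreasing in $\theta$,

\[
\lim_{v \to \bar v} h_\theta(v) = 0 \quad \text{for all } \theta \in \Theta .
\]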

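But this in turn implies that, with $\underline\theta$ the lowest shock in $\Theta$,

\[
\frac{\log(\beta/q)}{\max_{u \in [u_-,\,u_{\underline\theta}]} c''(u)/c'(u)} \le \liminf_{v \to \bar v} \, (u_\theta - u_-) .
\]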

Suppose that $q < \beta$. This implies that for $v$, or equivalently $u_-$, high enough, the policy
functions $u_\theta$ are all such that $u_\theta > u_-$. This in turn implies that $\psi^*$ necessarily has a support
bounded away from $\bar u$. This in turn implies that

\[
\int C'(v)\, d\psi^*(v) = \frac{\beta}{q} \int c'(u_-)\, d\psi^*(u_-) < \infty .
\]

Integrating

\[
\frac{\beta}{q}\, C'(v) = E[C'(w_\theta)]
\]

over $v$, we get

\[
\int C'(v)\, d\psi^*(v) = \frac{\beta}{q} \int C'(v)\, d\psi^*(v) .
\]

Since $\psi^*$ does not have full mass at misery, we have $\int C'(v)\, d\psi^*(v) > 0$. This in turn implies
that $\beta = q$, a contradiction. □
We have therefore proved that $q \ge \beta$ at $\psi^*$. But then from the equation

\[
\frac{\beta}{q}\, C'(v) = E[C'(w_\theta)]
\]

we see that $C'(v_t)$ is a positive supermartingale. By the martingale convergence theorem, for
any initial value $v_0 = v$, the sequence of random variables $\{C'(v_t)\}$ converges almost surely to a
random variable $C^\infty_v$ with

\[
E[C^\infty_v] \le C'(v) .
\]

Suppose that there exists a $v^*$ such that $\Pr(C^\infty_{v^*} > 0) > 0$. We will show that this is not
possible.

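For any realization $\theta^\infty$ define the set of periods where $\theta_t$ takes on some particular value
$\theta \in \Theta$ as

\[
\Theta_\theta(\theta^\infty) \equiv \{ t : \theta_t = \theta \} .
\]

Then since $\Theta$ is finite, we have that with probability one all values of $\theta$ occur infinitely often:

\[
\Pr\big( \#\Theta_\theta(\theta^\infty) = \infty \ \text{for all } \theta \in \Theta \big) = 1 .
\]

Hence there exists an event $\theta^\infty$ such that $C'(g^w(v^*, \theta^t(\theta^\infty)))$ converges to a positive and finite
value, and $\#\Theta_\theta(\theta^\infty) = \infty$ for all $\theta \in \Theta$. Hence $g^w(v^*, \theta^t(\theta^\infty))$ converges to a finite value $w^*$.
Since $g^w(v, \theta)$ is continuous in $v$, and $\#\Theta_\theta(\theta^\infty) = \infty$, this implies that $g^w(w^*, \theta) = w^*$ for all
$\theta \in \Theta$. This implies that the incentive constraints are not binding at $w^*$, a contradiction.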

Hence $\Pr(C^\infty_v > 0) = 0$ for all $v$. Therefore for all $v$, $C'(g^w(v, \theta^t))$ converges almost surely
to 0. This in turn implies that the stochastic process $C'(v_t)$ converges almost surely to 0. This
implies that $C'(v_t)$ converges in distribution to 0. Since $\psi^*$ is an invariant distribution, $C'(v_t)$
is distributed as $C'(v_0)$. This implies that the distribution of $C'(v_0)$ has full mass at zero, i.e.
that $\psi^*$ has full mass at misery.

C     Proof  of  Proposition  5 

We  start  with  two  lemmas,  and  then  proceed  to  prove  the  proposition. 
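Lemma 3. The following inequalities hold

\[
\underline\gamma\,(1 - k'(v)) + \Big(1 - \frac{\beta}{\hat\beta}\Big)
  \le 1 - k'(g^w(\theta, v))
  \le \bar\gamma\,(1 - k'(v)) + \Big(1 - \frac{\beta}{\hat\beta}\Big)
\]

for all $\theta \in \Theta$, where the constants are given by

\[
\bar\gamma = \frac{\beta}{\hat\beta} \max_{n} \frac{1 + \theta_n - E[\theta \mid \theta \le \theta_n]}{\theta_n}
\quad \text{and} \quad
\underline\gamma = \frac{\beta}{\hat\beta} \min_{2 \le n \le N} \frac{1 + \theta_{n-1} - E[\theta \mid \theta \ge \theta_n]}{\theta_{n-1}} .
\]
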
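
To make the two constants concrete, the following sketch evaluates them for a finite shock distribution. It assumes the constants take the form (beta/beta_hat) times a max, respectively a min, of the ratios (1 + theta_n - E[theta | theta <= theta_n]) / theta_n and (1 + theta_{n-1} - E[theta | theta >= theta_n]) / theta_{n-1}, which is how the lemma is read here. The function name `gamma_bounds`, the two-point distribution, and the discount factors are hypothetical illustrations, not values from the paper.

```python
def gamma_bounds(thetas, probs, beta, beta_hat):
    """Return (gamma_lower, gamma_upper) for shocks sorted in increasing order."""
    n = len(thetas)

    def cond_mean(indices):
        # Conditional mean of theta over the given index set.
        mass = sum(probs[i] for i in indices)
        return sum(probs[i] * thetas[i] for i in indices) / mass

    # Upper constant: max over n of
    # (1 + theta_n - E[theta | theta <= theta_n]) / theta_n
    upper = max(
        (1 + thetas[k] - cond_mean(range(k + 1))) / thetas[k] for k in range(n)
    )
    # Lower constant: min over n >= 2 of
    # (1 + theta_{n-1} - E[theta | theta >= theta_n]) / theta_{n-1}
    lower = min(
        (1 + thetas[k - 1] - cond_mean(range(k, n))) / thetas[k - 1]
        for k in range(1, n)
    )
    ratio = beta / beta_hat
    return ratio * lower, ratio * upper


# Hypothetical two-point shock distribution with mean one.
g_lo, g_hi = gamma_bounds([0.5, 1.5], [0.5, 0.5], beta=0.9, beta_hat=0.95)
print(g_lo, g_hi)
```

When the shocks have mean one, each ratio in the max is at least one and each ratio in the min is at most one, so the two constants bracket beta/beta_hat.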
Proof. Consider the program

\[
\max \sum_n p_n \{ u_n - \lambda c(u_n) + \lambda y(h_n) - \bar\theta_n h_n + \hat\beta k(w_n) \}
\]

subject to

\[
v = \sum_n p_n ( u_n + \beta w_n - \bar\theta_n h_n ) ,
\]
\[
-\theta_n h_n + u_n + \beta w_n \ge -\theta_n h_{n+1} + u_{n+1} + \beta w_{n+1}
  \quad \text{for } n = 1, 2, \ldots, K-1 .
\]

This problem and its notation require some discussion. We do not incorporate the monotonicity
constraint on $h$. But this notation allows us to consider bunching in the following way. If
any set of neighboring agents is bunched, then we group these agents under a single index and
let $p_n$ be the total probability of this group. Likewise let $\bar\theta_n$ represent the conditional
average of $\theta$ within this group, which is what is relevant for the promise-keeping constraint and
the objective function. Let $\theta_n$ be the shock of the highest agent in the group. The incentive
constraint must rule out the highest agent in each group deviating and taking the allocation
of the group above him.

Of course, every combination of bunched agents leads to a different program. The optimal
allocation of our problem must solve one of these programs with a strictly monotone
allocation, since bunching can be characterized by regrouping agents. Thus, below we
characterize solutions to these programs with strict monotonicity of the solution.

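The first-order conditions for $h_n$ and $w_n$ are

\[
p_n \{ \lambda y'(h_n) - \bar\theta_n + k'(v)\,\bar\theta_n \} - \theta_n \mu_n + \theta_{n-1}\mu_{n-1} = 0 ,
\]
\[
p_n \{ \hat\beta k'(w_n) - \beta k'(v) \} + \beta(\mu_n - \mu_{n-1}) = 0 ,
\]

where $\mu_n$ is the multiplier on the $n$-th incentive constraint and, by the envelope condition,
the multiplier on the promise-keeping constraint equals $k'(v)$.

Summing the first-order conditions for $h_n$, we get

\[
\lambda E[y'(h(\theta))] = 1 - k'(v) .
\]

Summing up the first-order conditions for $w_n$, we get

\[
E[k'(g^w(v, \theta))] = \frac{\beta}{\hat\beta}\, k'(v) .
\]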

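The first-order conditions for $n = 1$ imply

\[
(1 - k'(v))\,\frac{\bar\theta_1}{\theta_1} + \frac{\mu_1}{p_1}
  = \frac{\lambda y'(h_1)}{\theta_1}
  \le \frac{\lambda E[y'(h_\theta)]}{\theta_1}
  = \frac{1 - k'(v)}{\theta_1} .
\]

This implies

\[
\frac{\mu_1}{p_1} \le (1 - k'(v))\,\frac{1 - \bar\theta_1}{\theta_1} .
\]

Using

\[
k'(w_1) = \frac{\beta}{\hat\beta}\Big( k'(v) - \frac{\mu_1}{p_1} \Big) ,
\]

we get

\[
k'(w_1) \ge \frac{\beta}{\hat\beta}\Big[ \Big(1 + \frac{1 - \bar\theta_1}{\theta_1}\Big) k'(v) - \frac{1 - \bar\theta_1}{\theta_1} \Big] .
\]

Similarly, writing the first-order conditions for $n = K$, we get

\[
(1 - k'(v))\,\frac{\bar\theta_K}{\theta_{K-1}} - \frac{\mu_{K-1}}{p_K}
  = \frac{\lambda y'(h_K)}{\theta_{K-1}}
  \ge \frac{\lambda E[y'(h_\theta)]}{\theta_{K-1}}
  = \frac{1 - k'(v)}{\theta_{K-1}} .
\]

This implies

\[
\frac{\mu_{K-1}}{p_K} \le (1 - k'(v))\,\frac{\bar\theta_K - 1}{\theta_{K-1}} .
\]

Using

\[
k'(w_K) = \frac{\beta}{\hat\beta}\Big( k'(v) + \frac{\mu_{K-1}}{p_K} \Big) ,
\]

we get

\[
k'(w_K) \le \frac{\beta}{\hat\beta}\Big[ \Big(1 - \frac{\bar\theta_K - 1}{\theta_{K-1}}\Big) k'(v) + \frac{\bar\theta_K - 1}{\theta_{K-1}} \Big] .
\]

For any $n$, $w_K \le w_n \le w_1$, so, since $k'$ is decreasing,

\[
\frac{\beta}{\hat\beta}\Big[ \Big(1 + \frac{1 - \bar\theta_1}{\theta_1}\Big) k'(v) - \frac{1 - \bar\theta_1}{\theta_1} \Big]
  \le k'(w_n) \le
\frac{\beta}{\hat\beta}\Big[ \Big(1 - \frac{\bar\theta_K - 1}{\theta_{K-1}}\Big) k'(v) + \frac{\bar\theta_K - 1}{\theta_{K-1}} \Big] .
\]

After rearranging, we obtain

\[
\frac{\beta}{\hat\beta}\Big[ 1 + \frac{1 - \bar\theta_1}{\theta_1} \Big](1 - k'(v)) + 1 - \frac{\beta}{\hat\beta}
  \ge 1 - k'(g^w(\theta, v)) \ge
\frac{\beta}{\hat\beta}\Big[ 1 - \frac{\bar\theta_K - 1}{\theta_{K-1}} \Big](1 - k'(v)) + 1 - \frac{\beta}{\hat\beta} .
\]

□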
By Lemma 1 we have that $\lim_{v \to \underline v} k'(v) = 1$; then using the bounds we obtain that

\[
\lim_{v \to \underline v} k'(g^w(v, \theta)) = \frac{\beta}{\hat\beta} < 1 = \lim_{v \to \underline v} k'(v) ,
\]

for all $\theta \in \Theta$.

The following lemma describes the behavior of the optimal allocation when $v$ goes to $\underline v$.

Lemma 4. We have $g^u(v, \theta) > \underline u$ and

\[
\lim_{v \to \underline v} g^u(v, \theta) = \underline u , \qquad \lim_{v \to \underline v} g^h(v, \theta) = \infty .
\]

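Proof. Consider the program

\[
\max \sum_n p_n \{ u_n - \lambda c(u_n) + \lambda y(h_n) - \bar\theta_n h_n + \hat\beta k(w_n) \}
\]

subject to

\[
v = \sum_n p_n ( u_n + \beta w_n - \bar\theta_n h_n ) ,
\]
\[
-\theta_n h_n + u_n + \beta w_n \ge -\theta_n h_{n+1} + u_{n+1} + \beta w_{n+1}
  \quad \text{for } n = 1, 2, \ldots, K-1 .
\]

The first-order condition for $u_n$ is $1 - \lambda c'(u_n) = \frac{\hat\beta}{\beta} k'(w_n)$. Hence
$1 - \lambda c'(g^u(v, \theta)) = \frac{\hat\beta}{\beta} k'(g^w(v, \theta))$. Since $k'(g^w(v, \theta)) < \beta/\hat\beta$, we have $u_n > \underline u$.
Moreover, since $\lim_{v \to \underline v} k'(g^w(v, \theta)) = \beta/\hat\beta$, we have

\[
\lim_{v \to \underline v} c'(g^u(v, \theta)) = 0 \quad \text{or equivalently} \quad \lim_{v \to \underline v} g^u(v, \theta) = \underline u .
\]

That $\lim_{v \to \underline v} g^h(v, \theta) = \infty$ follows from

\[
\lambda E[y'(g^h(v, \theta))] = 1 - k'(v)
\]

and $\lim_{v \to \underline v} k'(v) = 1$. □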

Since the derivative $k'(v)$ is continuous and strictly decreasing, we can define the transition
function

\[
Q(x, \theta) = k'\big( g^w( (k')^{-1}(x), \theta ) \big)
\]

for all $x < 1$ if utility is unbounded below. For any probability distribution $\mu$, let $T_Q(\mu)$ be
the probability distribution defined by

\[
T_Q(\mu)(A) = \int 1_{\{Q(x,\theta) \in A\}} \, d\mu(x)\, dp(\theta)
\]

for any Borel set $A$. Define

\[
T_{Q,n} \equiv \frac{T_Q + T_Q^2 + \cdots + T_Q^n}{n} .
\]

For example, $T_{Q,n}(\delta_x)$ is the empirical average of $\{k'(v_t)\}_{t=1}^{n}$ over all histories of length $n$
starting with $k'(v_0) = x$. The following lemma establishes the existence of an invariant
distribution by considering the limits of $\{T_{Q,n}\}$.

We are now able to prove a proposition that implies the first part of Proposition 5, and
describes an algorithm to construct an invariant distribution.

Proposition 7. For each $x < 1$ there exists a subsequence $\{T_{Q,\phi(n)}(\delta_x)\}$ that converges weakly,
i.e. in distribution, to an invariant distribution on $(-\infty, 1)$ under $Q$.

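Proof. For all $\theta \in \Theta$,

\[
\lim_{x \uparrow 1} Q(x, \theta) = \lim_{v \to \underline v} k'(g^w(\theta, v)) = \frac{\beta}{\hat\beta} < 1 .
\]

Note that we have a continuous transition function $Q(x, \theta) : (-\infty, 1) \times \Theta \to (-\infty, 1)$.

We next show that the sequence $\{T_Q^n(\delta_x)\}$ is tight, in that for any $\varepsilon > 0$ there exists a
compact set $K_\varepsilon$ such that $T_Q^n(\delta_x)(K_\varepsilon) \ge 1 - \varepsilon$ for all $n$. The expected value of the
distribution $T_Q^n(\delta_x)$ is simply $E_{-1}[k'(v_n(\theta^{n-1}))]$ with $x = k'(v_0) < 1$. Recall that
$E_{-1}[k'(v_n(\theta^{n-1}))] = (\beta/\hat\beta)^n k'(v_0) \to 0$. This implies that

\[
\min\{0, k'(v_0)\} \le E_{-1}[k'(v_n(\theta^{n-1}))]
  \le T_Q^n(\delta_x)(-\infty, -A)\,(-A) + \big(1 - T_Q^n(\delta_x)(-\infty, -A)\big) \cdot 1
\]

for all $A > 0$. Rearranging,

\[
T_Q^n(\delta_x)(-\infty, -A) \le \frac{1 - \min\{0, k'(v_0)\}}{A + 1} .
\]

Hence we can find $A_\varepsilon > 0$ such that

\[
T_Q^n(\delta_x)(-\infty, -A_\varepsilon) \le \frac{\varepsilon}{2} .
\]

Define $a_\varepsilon$ by

\[
1 - a_\varepsilon = \sup_{x \in [-A_\varepsilon, 1),\ \theta \in \Theta} Q(x, \theta) .
\]

Since for all $\theta \in \Theta$, $\lim_{v \to -\infty} k'(g^w(v, \theta)) \le \beta/\hat\beta < 1$, we have $a_\varepsilon > 0$. In addition, for all
$n \ge 1$, $T_Q^n(\delta_x) = T_Q(T_Q^{n-1}(\delta_x))$, so that

\[
T_Q^n(\delta_x)(1 - a_\varepsilon, 1) \le T_Q^{n-1}(\delta_x)(-\infty, -A_\varepsilon) \le \frac{\varepsilon}{2} .
\]

Since we also have

\[
T_Q^n(\delta_x)(-\infty, -A_\varepsilon) \le \frac{\varepsilon}{2} ,
\]

this implies

\[
T_Q^n(\delta_x)\big( [-A_\varepsilon, 1 - a_\varepsilon] \big) \ge 1 - \varepsilon .
\]

Taking $K_\varepsilon = [-A_\varepsilon, 1 - a_\varepsilon]$, this implies that $\{T_Q^n(\delta_x)\}_{n \ge 1}$ is tight, and therefore
$\{T_{Q,n}(\delta_x)\}_{n \ge 1}$ is tight.

Tightness implies that there exists a subsequence $T_{Q,\phi(n)}(\delta_x)$ that converges weakly, i.e. in
distribution, to some probability distribution $\pi$ on $(-\infty, 1)$. Since $Q(x, \theta)$ is continuous in $x$,
$T_Q(T_{Q,\phi(n)}(\delta_x))$ converges weakly to $T_Q(\pi)$. But the linearity of $T_Q$ implies that

\[
T_Q\big(T_{Q,\phi(n)}(\delta_x)\big) = T_{Q,\phi(n)}(\delta_x) + \frac{T_Q^{\phi(n)+1}(\delta_x) - T_Q(\delta_x)}{\phi(n)} ,
\]

and since $\phi(n) \to \infty$ we must have $T_Q(\pi) = \pi$. □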
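
The averaging construction used in this proof can be illustrated on a finite state space, where the operator is simply a row-stochastic matrix. The sketch below is only an analogy under that assumption: the three-state chain `P`, the starting point mass, and the helper names `step` and `cesaro_average` are hypothetical and are not the paper's transition function Q. It computes the Cesaro average of the first n iterates and checks that applying one more transition changes it by at most 2/n in total variation, which is the finite-state counterpart of the linearity step.

```python
def step(dist, P):
    """Apply one transition: return the row vector dist times the matrix P."""
    size = len(P)
    return [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]


def cesaro_average(start, P, n):
    """Return (1/n) * sum over t = 1..n of start * P^t."""
    avg = [0.0] * len(start)
    cur = list(start)
    for _ in range(n):
        cur = step(cur, P)
        avg = [a + c / n for a, c in zip(avg, cur)]
    return avg


# Hypothetical periodic chain: the iterates themselves oscillate,
# but their running averages settle down.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
start = [1.0, 0.0, 0.0]  # point mass on the first state
n = 2000

pi_n = cesaro_average(start, P, n)
# Distance between the average and its image under one more transition.
gap = sum(abs(a - b) for a, b in zip(step(pi_n, P), pi_n))
print(pi_n, gap)
```

The periodic example is chosen because the iterates alone do not converge there, which is why the argument works with averages rather than with the iterates themselves.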

Note that for any invariant distribution $\pi$, $T_Q(\pi) = \pi$ implies that the support of $\pi$ is
contained in $(-\infty, \beta/\hat\beta]$. This proves the second part of Proposition 5. We finally prove a
lemma that implies the last part of Proposition 5.

Lemma 5. Suppose that $\limsup_{u \to \infty} c''(u)/c'(u) < \infty$. Then any invariant distribution $\psi$
necessarily has a support bounded away from $\bar v$.

Proof. We will make use of two possible state variables. The first state variable is the natural
one: $v$, promised future utility. The other one is utility attained by the previous generation,
$u_-$. Indeed, from the first-order conditions, it is easy to see that these two state variables are
related by

\[
1 - \lambda c'(u_-) = \frac{\hat\beta}{\beta}\, k'(v) .
\]

The existence of an invariant distribution $\psi^*(v)$ with no mass at misery is equivalent to the
existence of an invariant distribution $\psi^*(u_-)$ with no mass at misery.

Let $x_\theta = u_\theta + \beta w_\theta$. Then we can rewrite the Bellman equation (21) as

\[
k(v) = \sup E[ u_\theta - \lambda c(u_\theta) - \theta h_\theta + \lambda y(h_\theta) + \hat\beta k(w_\theta) ]
\]

subject to

\[
v = E[x_\theta - \theta h_\theta] ,
\]
\[
x_\theta - \theta h_\theta \ge x_{\theta'} - \theta h_{\theta'} ,
\]
\[
u_\theta + \beta w_\theta = x_\theta .
\]
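Hence, given a value $x$ for $x_\theta$, $u_\theta$ and $w_\theta$ are given by the sub-program

\[
\max_{u,w} \; u - \lambda c(u) + \hat\beta k(w)
\]

subject to

\[
u + \beta w = x .
\]

The solution is given by the first-order condition

\[
1 - \lambda c'(u) - \frac{\hat\beta}{\beta}\, k'\Big(\frac{x - u}{\beta}\Big) = 0 .
\]

Using the implicit function theorem, we can then compute

\[
\frac{du}{dx} = \frac{-\frac{\hat\beta}{\beta^2}\, k''\big(\frac{x - u}{\beta}\big)}{-\frac{\hat\beta}{\beta^2}\, k''\big(\frac{x - u}{\beta}\big) + \lambda c''(u)} .
\]

Hence

\[
0 < \frac{du}{dx} < 1 .
\]

This in turn implies that there exists $M > 0$ such that

\[
\max_{\theta,\theta'} |u_{\theta'} - u_\theta| \le M \max_{\theta} h_\theta .
\]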

Consider the program (C). The first-order condition for $h_\theta$ is

where  A  =  k'{v).  This  implies  that 

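\[
y'(h_{\underline\theta}) \ge \underline\theta \, \frac{1 - k'(v)}{\lambda} ,
\]

where $\underline\theta$ denotes the lowest shock in $\Theta$. This shows that

\[
\lim_{v \to \bar v} y'(h_{\underline\theta}(v)) = \infty \quad \Rightarrow \quad \lim_{v \to \bar v} h_{\underline\theta}(v) = 0 ,
\]

and since $h_\theta$ is decreasing in $\theta$,

\[
\lim_{v \to \bar v} h_\theta(v) = 0 \quad \text{for all } \theta \in \Theta .
\]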
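The first-order condition (33) implies that, with $\bar\theta$ the highest shock in $\Theta$ (the one with the
lowest $u_\theta$),

\[
c'(u_-) \ge \frac{\hat\beta}{\beta}\, c'(u_{\bar\theta}) - \lambda^{-1}\Big(\frac{\hat\beta}{\beta} - 1\Big) ,
\]

which can be rewritten as

\[
\frac{\beta}{\hat\beta}\, c'(u_-) + \lambda^{-1}\Big(1 - \frac{\beta}{\hat\beta}\Big) \ge c'(u_{\bar\theta}) .
\]

This in turn implies that for all $\theta \in \Theta$

\[
\exp\Big( M \max_{\theta'} h_{\theta'} \max_{u \in [u_{\bar\theta},\,u_\theta]} \frac{c''(u)}{c'(u)} \Big)
  \Big( \frac{\beta}{\hat\beta}\, c'(u_-) + \lambda^{-1}\Big(1 - \frac{\beta}{\hat\beta}\Big) \Big) \ge c'(u_\theta) .
\]

Since

\[
\lim_{v \to \bar v} h_\theta(v) = 0 \quad \text{for all } \theta \in \Theta ,
\]

we have

\[
\lim_{u_- \to \bar u} \exp\Big( M \max_{\theta'} h_{\theta'} \max_{u \in [u_{\bar\theta},\,u_\theta]} \frac{c''(u)}{c'(u)} \Big) = 1
  \quad \text{for all } \theta \in \Theta .
\]

This in turn proves that for $u_-$ high enough, all the policy functions $u_\theta$ are such that $u_\theta < u_-$.
Hence any invariant distribution $\psi^*$ necessarily has a support bounded away from $\bar u$. This is
equivalent to saying that any invariant distribution $\psi$ necessarily has a support bounded away
from $\bar v$. □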